Re: matching patterns after regex?

2009-08-12 Thread Martin
On Aug 12, 12:53 pm, Bernard bernard.ch...@gmail.com wrote:
 On 12 Aug, 06:15, Martin mdeka...@gmail.com wrote:



  Hi,

  I have a string (see below) and ideally I would like to pull out the
  decimal number which follows the bounding coordinate information. For
  example, from this string I would ideally return...

  s = '\nGROUP  = ARCHIVEDMETADATA\n
  GROUPTYPE= MASTERGROUP\n\n  GROUP  =
  BOUNDINGRECTANGLE\n\nOBJECT =
  NORTHBOUNDINGCOORDINATE\n  NUM_VAL  = 1\n
  VALUE= 19.82039\nEND_OBJECT =
  NORTHBOUNDINGCOORDINATE\n\nOBJECT =
  SOUTHBOUNDINGCOORDINATE\n  NUM_VAL  = 1\n
  VALUE= 9.910197\nEND_OBJECT =
  SOUTHBOUNDINGCOORDINATE\n\nOBJECT =
  EASTBOUNDINGCOORDINATE\n  NUM_VAL  = 1\n
  VALUE= 10.6506458717851\nEND_OBJECT =
  EASTBOUNDINGCOORDINATE\n\nOBJECT =
  WESTBOUNDINGCOORDINATE\n  NUM_VAL  = 1\n
  VALUE= 4.3188348375893e-15\nEND_OBJECT
  = WESTBOUNDINGCOORDINATE\n\n  END_GROUP

  NORTHBOUNDINGCOORDINATE = 19.82039
  SOUTHBOUNDINGCOORDINATE = 9.910197
  EASTBOUNDINGCOORDINATE = 10.6506458717851
  WESTBOUNDINGCOORDINATE = 4.3188348375893e-15

  So far I have only managed to extract the numbers by doing
  re.findall(r'[\d.]*\d', s), which returns

  ['1',
   '19.82039',
   '1',
   '9.910197',
   '1',
   '10.6506458717851',
   '1',
   '4.3188348375893',
   '15',
  etc.

  Now the first problem that I can see is that my string match chops off
  the e-15 part and I am not sure how to incorporate the potential for
  that in my pattern match. Does anyone have any suggestions as to how I
  could also match this? Ideally I would have a statement which printed
  the number between the two bounding coordinate strings for example

  NORTHBOUNDINGCOORDINATE\n  NUM_VAL  = 1\n
  VALUE= 19.82039\nEND_OBJECT =
  NORTHBOUNDINGCOORDINATE\n\n

  Something that matched NORTHBOUNDINGCOORDINATE and printed the
  decimal number before it hit the next string
  NORTHBOUNDINGCOORDINATE. But I am not sure how to do this. Any
  suggestions would be appreciated.

  Many thanks

  Martin

 Hey Martin,

 here's a regex I've just tested:
 (\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)

 the first match corresponds to the whateverBOUNDINGCOORDINATE and the
 second match is the value.

 please provide some more entries if you'd like me to test my regex
 some more :)

 cheers

 Bernard

Thanks Bernard it doesn't seem to be working for me...

I tried

re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)

is that what you meant? Apologies if not, that results in a syntax
error:

In [557]: re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)

   File "<ipython console>", line 1
 re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)
  ^
SyntaxError: unexpected character after line continuation character

Thanks



-- 
http://mail.python.org/mailman/listinfo/python-list


Re: matching patterns after regex?

2009-08-12 Thread Steven D'Aprano
On Wed, 12 Aug 2009 05:12:22 -0700, Martin wrote:

 I tried
 
 re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)

You need to put quotes around strings.

In this case, because you're using regular expressions, you should use a 
raw string:

re.findall(r"(\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)", s)

will probably work.
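
To illustrate the point about quoting and raw strings, here is a minimal sketch. The sample text is a made-up, shortened stand-in for Martin's string, and the pattern is deliberately simpler than the one discussed above; both are assumptions for illustration only.

```python
import re

# Hypothetical, shortened sample (not Martin's full string).
sample = "OBJECT = NORTHBOUNDINGCOORDINATE\n  NUM_VAL = 1\n  VALUE = 19.82039\n"

# Without quotes the pattern is not a string at all, so Python stops at the
# first backslash with a SyntaxError. Inside a raw string (r"...") the
# backslashes are passed to the re module unchanged.
pattern = r"VALUE\s+=\s+([\d.]+)"

values = re.findall(pattern, sample)
print(values)

# A raw string keeps the backslash as a real character:
# "\b" is a single backspace character, r"\b" is a backslash followed by "b".
print(len("\b"), len(r"\b"))
```

The second print shows why raw strings are the usual choice for patterns: in a normal string some escapes, such as \b, are interpreted by Python before the re module ever sees them.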





-- 
Steven


Re: matching patterns after regex?

2009-08-12 Thread Martin
On Aug 12, 1:23 pm, Steven D'Aprano st...@remove-this-cybersource.com.au wrote:
 On Wed, 12 Aug 2009 05:12:22 -0700, Martin wrote:
  I tried

  re.findall((\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+),s)

 You need to put quotes around strings.

 In this case, because you're using regular expressions, you should use a
 raw string:

 re.findall(r"(\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)", s)

 will probably work.

 --
 Steven

Thanks, I see.

So I tried it, and if I use it as it is, it matches the first instance:

In [594]: re.findall(r"(\w+COORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)", s)
Out[594]: [('NORTHBOUNDINGCOORDINATE', '1')]

So I adjusted the first part of the regex, on the basis I could sub
NORTH for SOUTH etc.

In [595]: re.findall(r"(NORTHBOUNDINGCOORDINATE).*\s+VALUE\s+=\s([\d\.\w-]+)", s)
Out[595]: [('NORTHBOUNDINGCOORDINATE', '1')]

But in both cases it returns the value that comes after NUM_VAL =
rather than the decimal value that comes after VALUE =?
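
One possible way to pair each coordinate name with its VALUE line, while keeping an exponent such as e-15, is sketched below. This is not the pattern anyone in the thread posted. The sample string is a shortened reconstruction from the values quoted earlier, its exact whitespace is an assumption, and the names `pattern` and `coords` are made up for the example.

```python
import re

# Shortened sample built from values quoted in the thread; the spacing of
# the real metadata is an assumption.
s = (
    "OBJECT = NORTHBOUNDINGCOORDINATE\n  NUM_VAL = 1\n"
    "  VALUE = 19.82039\nEND_OBJECT = NORTHBOUNDINGCOORDINATE\n\n"
    "OBJECT = WESTBOUNDINGCOORDINATE\n  NUM_VAL = 1\n"
    "  VALUE = 4.3188348375893e-15\nEND_OBJECT = WESTBOUNDINGCOORDINATE\n"
)

pattern = re.compile(
    r"""
    \bOBJECT\s*=\s*(\w+BOUNDINGCOORDINATE)   # group 1: the coordinate name
    .*?                                      # lazily skip the lines in between
    \bVALUE\s*=\s*                           # the VALUE line (NUM_VAL does not match)
    ([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)    # group 2: float with optional exponent
    """,
    re.VERBOSE | re.DOTALL,  # DOTALL lets .*? cross newlines
)

# findall returns one (name, value) tuple per OBJECT block.
coords = {name: float(value) for name, value in pattern.findall(s)}
for name, value in coords.items():
    print(name, "=", value)
```

The \b before OBJECT keeps the pattern from starting at END_OBJECT, because the underscore counts as a word character. The lazy .*? stops at the first VALUE line after the name instead of running to the last one in the string.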




Re: matching patterns after regex?

2009-08-12 Thread Martin
On Aug 12, 1:42 pm, Martin mdeka...@gmail.com wrote:
 [snip]

I think I kind of got that to work...but I am clearly not quite
understanding how it works as I tried to use it again to match
something else.

In this case I want to print the values 0.00 and 2223901.039333
from a string like this...

YDim=1200\n\t\tUpperLeftPointMtrs=(0.00,2223901.039333)\n\t\t

I tried the following, which I thought would match the statement and
return the decimal number after the equals sign:

re.findall(r"(\w+UpperLeftPointMtrs)*=\s([\d\.\w-]+)", s)

where s is the string

Many thanks for the help


Re: matching patterns after regex?

2009-08-12 Thread Bernard
On 12 Aug, 12:43, Martin mdeka...@gmail.com wrote:
 [snip]

You have to do it with two capture groups in the same regex:

regex = r"UpperLeftPointMtrs=\(([\d\.]+),([\d\.]+)"

The first group matches the number before the comma and the second one
matches the number after it :)
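
As a rough sketch of how the two groups can be read back out, the example below applies the pattern to the sample line quoted earlier in the thread. The closing `\)` and the conversion to float are additions made for illustration, and the pattern as written assumes non-negative numbers without exponents.

```python
import re

# Sample line as quoted in the thread.
s = "YDim=1200\n\t\tUpperLeftPointMtrs=(0.00,2223901.039333)\n\t\t"

# Two capture groups: one before the comma, one after it.
regex = r"UpperLeftPointMtrs=\(([\d.]+),([\d.]+)\)"

match = re.search(regex, s)
if match:
    # groups() returns the captured strings in order.
    x, y = (float(g) for g in match.groups())
    print(x, y)
```

re.search is used here because only one occurrence is expected. re.findall with the same pattern would return a list of (first, second) tuples instead.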

You should probably learn how to play with regexes.
I personally use a visual tool called RX Toolkit[1] that comes with
Komodo IDE.

[1] http://docs.activestate.com/komodo/4.4/regex.html


Re: matching patterns after regex?

2009-08-12 Thread Mark Lawrence

Bernard wrote:

[snip]
You should probably learn how to play with regexes.
I personally use a visual tool called RX Toolkit[1] that comes with
Komodo IDE.

[1] http://docs.activestate.com/komodo/4.4/regex.html

Haven't tried it myself but how about this?
http://re-try.appspot.com/

--
Kindest regards.

Mark Lawrence.



Re: matching patterns after regex?

2009-08-12 Thread Martin
On Aug 12, 10:29 pm, Mark Lawrence breamore...@yahoo.co.uk wrote:
 [snip]
 Haven't tried it myself but how about this?
 http://re-try.appspot.com/

 --
 Kindest regards.

 Mark Lawrence.

Thanks Mark and Bernard. I have managed to get it working and I
appreciate the help with understanding the syntax. The web links are
also very useful, I'll give them a go.

Martin