Not sure why you need the second re. Once the search has succeeded, you know
you have "/" followed by comma-separated numbers, so you could just do a
split() on ",".

On Sun, Aug 23, 2009 at 11:00 AM, Randolph Bentson <[email protected]> wrote:

> On Sun, Aug 23, 2009 at 09:02:53AM -0700, James Thiele wrote:
> > If I'm reading this correctly, you want to verify that the full string
> > matches "(AB+)+" and then print it followed by the submatches of "AB+".
> > Combining your code with Bryan's suggestion:
> >    #!/usr/bin/env python
> >    import re
> >    ptn = re.compile("^((AB+)+)$")
> >    str = "ABABBABBBABBBBABBBBBABBBBBB"
> >    if ptn.match(str):
> >        print str, re.findall('(AB+)', str)
>
> Thanks for your help.  I had simplified my example, but this solves the
> core problem. Here's an extract from the actual data and my application
> of your suggestions:
>    #!/usr/bin/env python
>    import re
>    wpx = re.compile("WPX/(\d+)(,([-+]?\d+\.\d*e[-+]\d+))+")
>    floats = re.compile("[-+]?\d+\.\d*e[-+]\d+")
>    #
>    lines = [
>        "WPX/1,8.2954231790e+006,1.0133209480e+005,1.7395780740e-004",
>        "WPX/2,2.739e+06,3.301e+04,-8.822e+00,-4.688e+00,-1.443e-01,-6.109e-02",
>        "WPX/3,1.3e+5,6.2e+2,-1.7e-1,-1.8e+1,-4.3e-3,-2.1e-5,-7.4e-2,-2.6-5,7.2e-7,1.0e-6",
>        "Other stuff", ]
>    info = {"WPX":[], }
>    #
>    for line in lines:
>        mo = wpx.search(line)
>        if mo:
>            info["WPX"].append([int(mo.group(1))]+map(float,floats.findall(line)))
>            continue
>    #
>    # ... much later ...
>    #
>    for value in info["WPX"]:
>        print value
>
> The technique works for this case, but it seems a bit fragile.  I still
> wonder if there isn't a more robust method which would work for a messier
> collection of nested groups. Perhaps I'll have to revert to traditional
> parsing when that case appears.
>
> --
> Randolph Bentson
> [email protected]
>



-- 
No electrons were harmed in the creation of this email.
