Re: Best Way to extract Numbers from String

Someone Something Sat, 20 Mar 2010 17:54:49 -0700

It's an extremely bad idea to use regex for HTML. Change one tiny little
thing in the markup and you have to write the regex all over again. If it's
a throwaway script, then go ahead.
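
For anything that has to keep working, a parser is less brittle. Below is a
rough sketch using only the standard library's html.parser. CellCollector and
extract_cells are names I made up for illustration, the sample is a trimmed
copy of the table cells from the original post, and picking cells 2, 4 and 6
assumes the column order never changes, which may not hold for the real page.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> element, in document order."""

    def __init__(self):
        super().__init__()
        self.cells = []       # one string per <td> seen so far
        self._in_td = False   # True while we are between <td> and </td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._in_td = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        # Text inside nested tags (e.g. <a>) is also appended to the cell.
        if self._in_td:
            self.cells[-1] += data


def extract_cells(html):
    """Return the stripped text of each <td> in the given HTML fragment."""
    parser = CellCollector()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return [cell.strip() for cell in parser.cells]


# Trimmed sample based on the string in the original post.
sample = """</th>
<td class="last">43.200 </td>
<td class="change indicator" nowrap>0.040 </td>
<td>43.150 </td>
<td>43.200 </td>
<td>43.130 </td>
<td>43.290 </td>
<td>43.100 </td>
<td>7,450,447 </td>
</tr>"""

cells = extract_cells(sample)
# Assumption: the wanted values are always the 3rd, 5th and 7th cells.
wanted = [float(cells[i]) for i in (2, 4, 6)]
print(wanted)  # [43.15, 43.13, 43.1]
```

If the site adds an attribute or reflows the whitespace, this keeps working;
if it reorders the columns, only the index tuple needs to change.
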
2010/3/20 Luis M. González <[email protected]>


> On Mar 20, 12:04 am, Jimbo <[email protected]> wrote:
> > Hello
> >
> > I am trying to grab some numbers from a string containing HTML text.
> > Can you suggest any good functions that I could use to do this? What
> > would be the easiest way to extract the following numbers from this
> > string...
> >
> > My String has this layout & I have commented what I want to grab:
> > [CODE] """</th>
> >     <td class="last">43.200 </td>
> >     <td class="change indicator" nowrap>0.040 </td>
> >     <td>43.150 </td> # I need to grab this number only
> >     <td>43.200 </td>
> >     <td>43.130 </td> # I need to grab this number only
> >     <td>43.290 </td>
> >     <td>43.100 </td> # I need to grab this number only
> >     <td>7,450,447 </td>
> >     <td class="middle"><a href="/asx/markets/optionPrices.do?by=underlyingCode&underlyingCode=BHP&expiryDate=&optionType=">Options</a></td>
> >     <td class="middle"><a href="/asx/markets/warrantPrices.do?by=underlyingAsxCode&underlyingCode=BHP">Warrants & Structured Products</a></td>
> >     <td class="middle"><a href="/asx/markets/cfdPrices.do?by=underlyingAsxCode&underlyingCode=BHP">CFDs</a></td>
> >     <td class="middle"><a href="http://hfgapps.hubb.com/asxtools/Charts.aspx?TimeFrame=D6&compare=comp_index&indicies=XJO&pma1=20&pma2=20&asxCode=BHP"><img src="/images/chart.gif" border="0" height="15" width="15"></a>
> >     </td>
> >     <td><a href="/research/announcements/status_notes.htm#XD">XD</a>
> >     </td>
> >     <td><a href="/asx/statistics/announcements.do?by=asxCode&asxCode=BHP&timeframe=D&period=W">Recent</a>
> >     </td>
> > </tr>"""[/CODE]
>
>
> You should use BeautifulSoup or perhaps regular expressions.
> Or if you are not very smart, like me, just try a brute force approach:
>
> >>> for i in s.split('>'):
> ...     for e in i.split():
> ...         if '.' in e and e[0].isdigit():
> ...             print(e)
>
>
> 43.200
> 0.040
> 43.150
> 43.200
> 43.130
> 43.290
> 43.100
> >>>
> --
> http://mail.python.org/mailman/listinfo/python-list
>

