Re: Best Way to extract Numbers from String

2010-03-20 Thread Luis M. González
On Mar 20, 12:04 am, Jimbo nill...@yahoo.com wrote:
 Hello

 I am trying to grab some numbers from a string containing HTML text.
 Can you suggest any good functions that I could use to do this? What
 would be the easiest way to extract the following numbers from this
 string...

 My string has this layout & I have commented what I want to grab:
 [CODE] </th>
 <td class="last">43.200 </td>
 <td class="change indicator" nowrap>0.040 </td>
 <td>43.150 </td> # I need to grab this number only
 <td>43.200 </td>
 <td>43.130 </td> # I need to grab this number only
 <td>43.290 </td>
 <td>43.100 </td> # I need to grab this number only
 <td>7,450,447 </td>
 <td class="middle"><a
   href="/asx/markets/optionPrices.do?by=underlyingCode&underlyingCode=BHP&expiryDate=&optionType=">Options</a></td>
 <td class="middle"><a
   href="/asx/markets/warrantPrices.do?by=underlyingAsxCode&underlyingCode=BHP">Warrants & Structured Products</a></td>
 <td class="middle"><a
   href="/asx/markets/cfdPrices.do?by=underlyingAsxCode&underlyingCode=BHP">CFDs</a></td>
 <td class="middle"><a
   href="http://hfgapps.hubb.com/asxtools/Charts.aspx?TimeFrame=D6&compare=comp_index&indicies=XJO&pma1=20&pma2=20&asxCode=BHP"><img
   src="/images/chart.gif" border="0" height="15" width="15"></a>
 </td>
 <td><a href="/research/announcements/status_notes.htm#XD">XD</a>
 </td>
 <td><a href="/asx/statistics/announcements.do?by=asxCode&asxCode=BHP&timeframe=D&period=W">Recent</a>
 </td>
 </tr>[/CODE]


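You should use BeautifulSoup or perhaps regular expressions.
Or if you are not very smart, like me, just try a brute force approach:

# s is the HTML string from the question
for i in s.split('>'):      # cut the text after each tag's closing '>'
    for e in i.split():     # then split each piece on whitespace
        if '.' in e and e[0].isdigit():
            print(e)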


43.200
0.040
43.150
43.200
43.130
43.290
43.100
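
For anyone reusing the brute force idea, here is a minimal sketch of the same loop wrapped in a function that returns the tokens instead of printing them. The function name and the shortened sample string are made up for illustration. Note that the test keys on the '.' character, so it also picks up the class="last" cell and skips a comma-grouped figure such as 7,450,447.

```python
def decimal_tokens(s):
    """Return whitespace-separated tokens that start with a digit and contain '.'."""
    found = []
    for chunk in s.split(">"):          # cut after each tag's closing '>'
        for token in chunk.split():     # then split on whitespace
            if "." in token and token[0].isdigit():
                found.append(token)
    return found


# Hypothetical, shortened sample shaped like the row in the question.
sample = '<td class="last">43.200 </td><td>43.150 </td><td>7,450,447 </td>'
tokens = decimal_tokens(sample)
print(tokens)  # ['43.200', '43.150']
```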

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Best Way to extract Numbers from String

2010-03-20 Thread Jimbo
On Mar 20, 11:51 pm, Luis M. González luis...@gmail.com wrote:
 You should use BeautifulSoup or perhaps regular expressions.
 Or if you are not very smart, like me, just try a brute force approach:

 for i in s.split('>'):
     for e in i.split():
         if '.' in e and e[0].isdigit():
             print(e)

 43.200
 0.040
 43.150
 43.200
 43.130
 43.290
 43.100




Thanks very much. I'm going to look at regular expressions, but thanks
for your code; it shows me how I can do it with standard Python :)


Re: Best Way to extract Numbers from String

2010-03-20 Thread Someone Something
It's an extremely bad idea to use regex for HTML. You want to change one tiny
little thing and you have to write the regex all over again. If it's a
throwaway script, then go ahead.
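
A small sketch of the kind of breakage meant here, using made-up markup: a pattern written for bare <td> cells stops matching as soon as the cells gain an attribute. The class name "price" is hypothetical and does not come from the page in the question.

```python
import re

# Pattern that only matches bare <td> cells holding a number.
PATTERN = re.compile(r"<td>([\d.,]+)\s*</td>")

original = "<tr><td>43.150 </td><td>43.130 </td></tr>"
# Same data, but each cell gained a (hypothetical) class attribute.
changed = '<tr><td class="price">43.150 </td><td class="price">43.130 </td></tr>'

before = PATTERN.findall(original)
after = PATTERN.findall(changed)
print(before)  # ['43.150', '43.130']
print(after)   # []
```
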
2010/3/20 Luis M. González luis...@gmail.com

 You should use BeautifulSoup or perhaps regular expressions.
 Or if you are not very smart, like me, just try a brute force approach:

 for i in s.split('>'):
     for e in i.split():
         if '.' in e and e[0].isdigit():
             print(e)


 43.200
 0.040
 43.150
 43.200
 43.130
 43.290
 43.100
 


Re: Best Way to extract Numbers from String

2010-03-20 Thread Novocastrian_Nomad
Regular expressions are very powerful, and I use them a lot in my
paying job (unfortunately not with Python). You are, however,
basically using a second programming language, which can be difficult
to master.

Does this give you the desired result?

import re

# 'code' is the HTML string from the original question
matches = re.findall(r'<td>([\d.,]+)\s*</td>', code)
for match in matches:
    print(match)

resulting in this output:
43.150
43.200
43.130
43.290
43.100
7,450,447
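
The question asked for only three of those values. One possible way to narrow the result down is sketched below, assuming the wanted cells are always the 1st, 3rd and 5th bare <td> in the row. That is an assumption about the page layout, not something the thread confirms, and the sample string is a shortened stand-in for the real HTML.

```python
import re

# Hypothetical sample shaped like the table row in the question.
code = """
<td class="last">43.200 </td>
<td class="change indicator" nowrap>0.040 </td>
<td>43.150 </td>
<td>43.200 </td>
<td>43.130 </td>
<td>43.290 </td>
<td>43.100 </td>
<td>7,450,447 </td>
"""

matches = re.findall(r"<td>([\d.,]+)\s*</td>", code)

# Assumption: the wanted cells are the 1st, 3rd and 5th bare <td>.
wanted = [float(matches[i]) for i in (0, 2, 4)]
# The last cell uses commas as thousands separators.
volume = int(matches[5].replace(",", ""))

print(wanted)  # [43.15, 43.13, 43.1]
print(volume)  # 7450447
```
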


Re: Best Way to extract Numbers from String

2010-03-19 Thread Gabriel Genellina

On Sat, 20 Mar 2010 00:04:08 -0300, Jimbo nill...@yahoo.com wrote:


 I am trying to grab some numbers from a string containing HTML text.
 Can you suggest any good functions that I could use to do this? What
 would be the easiest way to extract the following numbers from this
 string...

 My string has this layout & I have commented what I want to grab:
 [CODE] </th>
 <td class="last">43.200 </td>
 <td class="change indicator" nowrap>0.040 </td>
 <td>43.150 </td> # I need to grab this number only
 <td>43.200 </td>
 <td>43.130 </td> # I need to grab this number only


I'd use BeautifulSoup [1] to handle badly formed HTML like that.

[1] http://www.crummy.com/software/BeautifulSoup/
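
To give a rough idea of the parser-based approach without any third-party install, here is a sketch that uses the standard library's html.parser instead of BeautifulSoup. It is a stand-in for illustration, not BeautifulSoup's API. The class name CellCollector and the shortened sample are made up, and "a <td> with no attributes" is only one guess at how to identify the wanted cells.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> that carries no attributes."""

    def __init__(self):
        super().__init__()
        self.in_plain_td = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # attrs is an empty list for a bare <td>
            self.in_plain_td = not attrs
            if self.in_plain_td:
                self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_plain_td = False

    def handle_data(self, data):
        if self.in_plain_td:
            self.cells[-1] += data


# Hypothetical, shortened sample shaped like the row in the question.
html = (
    '<tr><td class="last">43.200 </td>'
    "<td>43.150 </td><td>43.200 </td><td>43.130 </td></tr>"
)
parser = CellCollector()
parser.feed(html)
parser.close()
cells = [c.strip() for c in parser.cells]
print(cells)  # ['43.150', '43.200', '43.130']
```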

--
Gabriel Genellina
