On Monday, July 6, 2015 at 8:22:11 AM UTC-7, Sai Harsh Tondomker wrote:
>
> Thanks for reply.
> Could you please give me one example and where I can get enough matter to
> write the code.
>
>
Here's a little fragment of stuff I'm doing (not with Web2Py) using
Beautiful Soup:
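    import codecs
    from bs4 import BeautifulSoup, NavigableString

    f = codecs.open("path/myfilename",
                    encoding="ISO8859-1", mode="r")
    soup = BeautifulSoup(f)
    for x in soup.find_all("td", class_="element-title"):
        if not x.contents:
            # empty cell: x.contents[0] would raise IndexError
            print("blowing up on %s" % x.name)
            continue
        # descend through first children until we reach the text node
        while not isinstance(x.contents[0], NavigableString):
            x = x.contents[0]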
Note that I'm actually parsing HTML ("element-title" shows up in <TD>
nodes), but the general approach is the same. If you go with Beautiful
Soup, I think you'll find their tutorial is quite easy to follow. After
all, I was able to follow it and come up with the above. The code that
follows this fragment essentially takes apart the text string I find in my
target node, so most of the rest of my code is Python string ops
(x.split(), etc.).
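
To make the whole flow concrete, here is a minimal, self-contained sketch of the same idea in Python 3: find the cells, walk down to the first text node, then pull the string apart. The sample HTML, the "name: details" text layout, and the helper names first_text and extract_titles are all made up for illustration; they are not from my real code, and your markup will differ.

```python
from bs4 import BeautifulSoup, NavigableString

# Made-up HTML for illustration only; not the real input file.
SAMPLE_HTML = """
<table>
  <tr><td class="element-title"><a href="#"><b>Widget A: 42 units</b></a></td></tr>
  <tr><td class="element-title">Widget B: 7 units</td></tr>
  <tr><td class="element-title"></td></tr>
  <tr><td class="other">ignored</td></tr>
</table>
"""


def first_text(tag):
    """Follow first children down from tag until a text node is found.

    Returns None if an empty tag is reached along the way.
    """
    node = tag
    while not isinstance(node, NavigableString):
        if not node.contents:
            return None
        node = node.contents[0]
    return str(node)


def extract_titles(html):
    """Return (name, words) pairs for every td with class element-title."""
    # "html.parser" is the parser bundled with Python; other parsers also work.
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for td in soup.find_all("td", class_="element-title"):
        text = first_text(td)
        if text is None:
            continue  # skip empty cells instead of raising IndexError
        # Assumed text layout "name: details"; plain string ops from here on.
        name, _, rest = text.partition(":")
        titles.append((name.strip(), rest.split()))
    return titles


titles = extract_titles(SAMPLE_HTML)
for name, words in titles:
    print(name, words)
```

Returning None for an empty cell is just one way to handle it; you may prefer to log the cell or raise, depending on how clean your input is.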
I'm not familiar with the other suggestions, except that I think Beautiful
Soup can use lxml as its underlying parser.
/dps