Hi, thanks for your help.
Michiel Overtoom wrote:
elca wrote:
I'm sorry; I'm also not familiar with newsgroups.
It's not a newsgroup, but a mailing list. And if you're new to a community
you're not familiar with, it's best to lurk for a few days to see how it is
used.
Pot. Kettle. Black.
elca wrote:
http://news.search.naver.com/search.naver?sm=tab_hty&where=news&query=korea+times&x=0&y=0
That is a Korean portal site; I searched with the keyword 'korea times',
and I want to scrape the results to a text file named 'blogscrap_save.txt'.
Aha, now we're getting somewhere.
Getting and parsing...
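As a sketch of that "save the scraped text to a file" step: the HTML below is an inline stand-in for the real Naver results page, and the parsing uses the stdlib html.parser instead of BeautifulSoup, so it runs offline.

```python
from html.parser import HTMLParser

# Inline stand-in for the fetched search-results page (assumption, not real data).
HTML = "<html><body><p>Korea Times article one</p><p>Korea Times article two</p></body></html>"

class TextCollector(HTMLParser):
    """Collect all visible text chunks from the page."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

parser = TextCollector()
parser.feed(HTML)

# Write the scraped text to the requested file name.
with open("blogscrap_save.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(parser.chunks))
```

For the real site you would fetch the page first and feed that HTML in instead of the inline snippet.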
elca wrote:
So is this position the bottom-posting position?
It is, but you should...
elca schrieb:
Hi,
thanks a lot.
Studying alone is a tough thing :)
How can I improve my skills?
1. Stop top-posting.
2. Read documentation
3. Use the interactive prompt
cheers
Paul
elca schrieb:
Hello,
The following is a script in which BeautifulSoup and PAMIE work together,
but when I run it, this error occurs:

  File "C:\test12.py", line 7, in
    bs = BeautifulSoup(ie.pageText())
AttributeError: PAMIE instance has no attribute 'pageText'

Hi,
You could...
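One way to chase down an AttributeError like this is to ask the instance what it actually provides instead of guessing. The class below is only a stand-in for the real PAMIE class, and 'getPageText' is an assumed name; the real method name depends on your PAMIE version.

```python
class Pamie:
    """Stand-in for the real PAMIE class, for illustration only.
    Treat 'getPageText' as an assumption; check your PAMIE version's source."""
    def getPageText(self):
        return "<html><body>example</body></html>"

ie = Pamie()

# List the public attributes the instance really has, instead of guessing:
methods = [name for name in dir(ie) if not name.startswith("_")]
print(methods)  # -> ['getPageText']
```

Once the right name shows up in that list, the failing line becomes `bs = BeautifulSoup(ie.getPageText())` (again, assuming your PAMIE version spells it that way).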
> ...and BeautifulSoup parses it', then maybe yes.
>
> Greetings,
>
> --
> "The ability of the OSS process to collect and harness
> the collective IQ of thousands of individuals across
> the Internet is simply amazing." - Vinod Valloppillil
> http://www.catb.org/~esr/halloween/halloween4.html
elca wrote:
Actually, the website I want to parse is in a different language.
A different website? What website? What text? Please show your actual
use case, instead of smokescreens.
So I quoted a common English website to make it easier to understand. :)
And, did you learn something...
> Eventually the whole program could be collapsed into one line:
>
> print BeautifulSoup(urllib2.urlopen("http://www.cnn.com")).find("a",
>     text="CNN Shop").findParent()["href"]
>
> ...but I think this is very ugly!
elca, 25.10.2009 08:46:
> I'm very sorry for my English.
It's fairly common in this newsgroup that people do not have a good level
of English, so that's perfectly ok. But you should try to provide more
information in your posts. Be explicit about what you tried and what failed
(and how!), and provide
elca wrote:
Yes, I want to extract the text 'CNN Shop' and the linked page
'http://www.turnerstoreonline.com'.
Well then.
First, we'll get the page using urllib2:
doc = urllib2.urlopen("http://www.cnn.com")
Then we'll feed it into the HTML parser:
soup = BeautifulSoup(doc)
Next, we'll look for the link...
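Those steps can be continued as a runnable sketch. This version uses the stdlib html.parser instead of BeautifulSoup, and an inline snippet instead of the live fetch, so it runs without network access; the only assumption is the shape of the anchor tag.

```python
from html.parser import HTMLParser

# Inline sample standing in for the fetched CNN page (assumed markup shape).
HTML = '<a href="http://www.turnerstoreonline.com/">CNN Shop</a>'

class LinkFinder(HTMLParser):
    """Record the href of the <a> tag whose link text matches `wanted`."""
    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self._href = None   # href of the <a> tag currently open
        self.result = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        # Text inside the currently open <a> tag: compare it to the target.
        if self._href and data.strip() == self.wanted:
            self.result = self._href

finder = LinkFinder("CNN Shop")
finder.feed(HTML)
print(finder.result)  # -> http://www.turnerstoreonline.com/
```

With BeautifulSoup the same lookup is the `find("a", text="CNN Shop").findParent()["href"]` chain shown elsewhere in this thread.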
On 25 Oct 2009, at 08:33, elca wrote:
www.cnn.com, in the main website page.
For example, if you look at www.cnn.com's HTML source, you can find a line
of HTML like this:
<a href="http://www.turnerstoreonline.com/">CNN Shop</a>
And, for example, I want to extract the text 'CNN Shop' from that HTML source.
> ...reproduce it manually in the web browser so I get a clear idea what
> exactly you're trying to achieve.
>
> Greetings,
On 25 Oct 2009, at 08:06, elca wrote:
Because of JavaScript, I'm trying to stick with PAMIE.
I see, your problem is not with lxml or BeautifulSoup, but with getting the
raw data in the first place.
I want to extract some text from the CNN website, such as 'CNN Shop' and
'Site map' at the bottom of the CNN website.
On 25 Oct 2009, at 07:45, elca wrote:
I want to make a web scraper.
If possible, I really want to make it work with BeautifulSoup or
lxml together with PAMIE.
Scraping information from webpages falls into two tasks:
1. Getting the HTML data
2. Extracting information from the HTML data
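That two-task split can be sketched with the stdlib alone: urllib.request for task 1 and html.parser for task 2. The demo at the bottom runs on an inline snippet so no network is needed; fetch() is only illustrative and the sample HTML is an assumption.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

def fetch(url):
    """Task 1: download the raw HTML (requires network access)."""
    return urlopen(url).read().decode("utf-8", "replace")

class TextExtractor(HTMLParser):
    """Task 2: pull the visible text out of the HTML."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def extract_text(html):
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

# Demo on an inline snippet, so it runs offline:
print(extract_text("<p>Hello <b>world</b></p>"))  # -> Hello world
```

Keeping the two tasks in separate functions is what lets you swap the fetching side (urllib2, PAMIE for JavaScript-heavy pages) without touching the extraction side (BeautifulSoup, lxml, or the parser above).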
> Try to answer questions like: Where does your
> data come from? Is it XML or HTML? What do you want to do with it?
>
> This might help:
>
> http://www.catb.org/~esr/faqs/smart-questions.html
>
> Stefan
Hi,

elca, 25.10.2009 02:35:
> hello...
> If anyone knows, please help me!
> I really want to know... I searched Google for a long time,
> but couldn't find a clear solution, partly because of my lack of Python
> knowledge.
> I want to use the IE.navigate function with BeautifulSoup or lxml.
> If anyone can, please help me!
> Thanks in advance.