[If you think there is a better newsgroup than this one where I could get
an answer, please tell me.]
I'd like to use Mozilla to make a web crawler that fetches HTML pages
(like wget) and extracts all the DOM text nodes from a page. It would be
a standalone application, written in C++, that uses whatever Mozilla
components are needed to accomplish this.
I'd like to get some pointers on where to start. Would the GRE be ideal
for this task? What library/component should I use to:
- browse a site automatically by following links
- extract the text nodes using Mozilla's internal DOM representation of
the page.
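To make the second item concrete, here is a rough sketch of the kind of traversal I have in mind. It does not use any Mozilla interface: `Node`, `makeText`, `makeElement` and `collectText` are hypothetical names I made up for illustration, standing in for whatever DOM node type Mozilla actually exposes. It only shows a depth-first walk that gathers the data of every text node in document order.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node. This is NOT a Mozilla type.
struct Node {
    enum class Type { Element, Text };
    Type type;
    std::string value;  // tag name for elements, character data for text nodes
    std::vector<std::unique_ptr<Node>> children;
};

// Helper (hypothetical): build a text node holding the given character data.
std::unique_ptr<Node> makeText(const std::string& data) {
    std::unique_ptr<Node> n(new Node{});
    n->type = Node::Type::Text;
    n->value = data;
    return n;
}

// Helper (hypothetical): build an empty element node with the given tag name.
std::unique_ptr<Node> makeElement(const std::string& tag) {
    std::unique_ptr<Node> n(new Node{});
    n->type = Node::Type::Element;
    n->value = tag;
    return n;
}

// Depth-first walk in document order; appends the data of each text node.
void collectText(const Node& node, std::vector<std::string>& out) {
    if (node.type == Node::Type::Text) {
        out.push_back(node.value);
        return;  // text nodes are treated as leaves here
    }
    for (const auto& child : node.children) {
        assert(child != nullptr);
        collectText(*child, out);
    }
}
```

What I am looking for is the Mozilla equivalent of `collectText`, applied to the document that the engine builds after loading a page.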
Alexandre Leduc
