On 3 Jul 2005 10:49:03 -0700, [EMAIL PROTECTED] wrote:
>What is the best way to use regular expressions to extract information
>from the internet if one wants to search multiple pages? Let's say I
>want to search all of www.cnn.com and get a list of all the words that
>follow "Michael."
>
>(1) Is Python the best language for this? (Plus is it time-efficient?
Python would be good for this, but if you just want a quick-and-dirty
solution, it might be:
bash $ wget -r -e robots=off -l 0 -c -t 3 http://www.cnn.com/
bash $ grep -r "Michael \w*" ./www.cnn.com/*
Or you could do a wget/Python mix, like:

import os
import re

# Mirror the site first (assumes wget is on your PATH)
os.system("wget -r -e robots=off -l 0 -c -t 3 http://www.cnn.com/")
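Once wget has mirrored the site into ./www.cnn.com/, you can walk the
downloaded files and pull out the words with re instead of grep. A minimal
sketch (the helper names and the mirror directory are my own choices, not
anything from wget):

```python
import os
import re

# Capture the word immediately following "Michael"
PATTERN = re.compile(r"Michael\s+(\w+)")

def words_after_michael(text):
    # Return every word that directly follows "Michael" in a string.
    return PATTERN.findall(text)

def scan_mirror(root="./www.cnn.com"):
    # Walk the directory tree wget created and collect all matches.
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, errors="ignore") as f:
                hits.extend(words_after_michael(f.read()))
    return hits
```

Compiling the pattern once and reusing it across files keeps the scan cheap
even over a large mirror.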