Very Basic Web Scrape

2006-04-07 Thread kc68
I'm trying to learn web scraping and am stopped at the basic point of scraping a portion of a web page. I'm able to scrape a full page and save it as *.xml or *.htm, and I think I understand regex, but the following fails: # Prints a portion of a red cross web page to a new

Re: Very Basic Web Scrape

2006-04-07 Thread Oliver Block
Hi, I understand regex, but the following fails: open PAGE, 'c://redcross.htm'; while( my $line = <PAGE> ) { $line =~ /Health and Safety Classes/ print $1\n; } What fails? You forgot a ';' after the regex, but I guess that's not what you mean!? :) cu, Oliver

Re: Very Basic Web Scrape

2006-04-07 Thread kc68
On Fri, 07 Apr 2006 16:02:53 -0400, Oliver Block [EMAIL PROTECTED] wrote: Hi, I understand regex, but the following fails: open PAGE, 'c://redcross.htm'; while( my $line = <PAGE> ) { $line =~ /Health and Safety Classes/ print $1\n; } What fails? You forgot a ';' after the regex but I guess

Re: Very Basic Web Scrape

2006-04-07 Thread Joshua Colson
On Fri, 2006-04-07 at 16:36 -0400, [EMAIL PROTECTED] wrote: On Fri, 07 Apr 2006 16:02:53 -0400, Oliver Block [EMAIL PROTECTED] wrote: Hi, I understand regex, but the following fails: open PAGE, 'c://redcross.htm'; while( my $line = <PAGE> ) { # $line =~ /Health and Safety Classes/

Re: Very Basic Web Scrape

2006-04-07 Thread Oliver Block
On Friday, 7 April 2006 22:36, [EMAIL PROTECTED] wrote: I was trying to limit the result to the words /Health and Safety Classes/ that appear on the page. How do I get there? First you need to understand regex! :) open PAGE, 'c://redcross.htm'; while( my $line = <PAGE> ) { $line =~
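
For readers following the thread, here is a minimal sketch of one way the quoted loop could be made to print the phrase. It is not taken from any of the replies: the helper name find_phrase and the sample lines are hypothetical, and the sample data only stands in for the saved page. The point it illustrates is that $1 is filled only when the pattern contains a capture group (parentheses), and that the match statement needs its terminating ';'.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return every occurrence of $phrase in the given lines.
# The parentheses in the pattern form a capture group; without them $1 stays unset.
sub find_phrase {
    my ($phrase, @lines) = @_;
    my @hits;
    for my $line (@lines) {
        # \Q...\E makes the phrase match literally; /g finds repeated matches per line.
        while ($line =~ /(\Q$phrase\E)/g) {
            push @hits, $1;
        }
    }
    return @hits;
}

# Made-up sample input standing in for the saved page.
my @sample = (
    '<html><body>',
    '<a href="#">Health and Safety Classes</a>',
    '<p>Other text</p>',
    '</body></html>',
);

# Print each match on its own line; note the quotes around "$_\n".
print "$_\n" for find_phrase('Health and Safety Classes', @sample);
```

To run the same search against a saved file, the sample array could be replaced by lines read from a lexical filehandle, for example open(my $fh, '<', $path) or die "Cannot open $path: $!"; followed by my @lines = <$fh>;, where $path is whatever location the page was saved to.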

Re: Very Basic Web Scrape

2006-04-07 Thread Dave Gray
On 4/7/06, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: On Fri, 07 Apr 2006 16:02:53 -0400, Oliver Block [EMAIL PROTECTED] wrote: I understand regex, but the following fails: open PAGE, 'c://redcross.htm'; while( my $line = <PAGE> ) { $line =~ /Health and Safety Classes/ print $1\n; }

Re: Very Basic Web Scrape

2006-04-07 Thread Jaime Murillo
On Friday 07 April 2006 13:15, [EMAIL PROTECTED] wrote: I'm trying to learn web scraping and am stopped at the basic point of scraping a portion of a web page. I'm able to scrape a full page and save it as *.xml or *.htm, and I think I understand regex, but the following fails: