I would also strip out any non-displayable tags before
you pull your 100-character synopsis.

 $s =~ s/<script.*?script>//sgi;  # remove script tags
 $s =~ s/<style.*?style>//sgi;    # remove style tags
 $s =~ s/<%.*?%>/ /sg;            # remove ASP "<%...%>"s
 $s =~ s/<!--.*?-->/ /sg;         # remove <!--comments-->
 while ($s =~ s/<[^<>]*>/ /sg){}  # remove other nested "<...>"s
 $s =~ s/&#?\w+[;\s]/ /g;         # remove "&#9;", "&nbsp;", etc.
 $synopsis = substr $s,0,100;     # pull first 100 chars
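
If you want to reuse the steps above, they could be wrapped in a small sub, roughly as sketched below. This is only a sketch: the name synopsis() and the optional $len argument are made up for illustration, and the whitespace collapsing and trimming are an extra step I am assuming you would want, so the characters are not spent on the blanks left behind by the removed tags.

```perl
use strict;
use warnings;

# Hypothetical helper: the name and the $len parameter are illustrative,
# not something from the original script.
sub synopsis {
    my ($s, $len) = @_;
    $len = 100 unless defined $len;

    $s =~ s/<script.*?script>//sgi;     # remove script blocks
    $s =~ s/<style.*?style>//sgi;       # remove style blocks
    $s =~ s/<%.*?%>/ /sg;               # remove ASP "<%...%>"s
    $s =~ s/<!--.*?-->/ /sg;            # remove <!--comments-->
    while ($s =~ s/<[^<>]*>/ /g) {}     # remove remaining (possibly nested) tags
    $s =~ s/&#?\w+[;\s]/ /g;            # remove "&#9;", "&nbsp;", etc.

    # Assumed extra step: collapse runs of whitespace and trim both ends.
    $s =~ s/\s+/ /g;
    $s =~ s/^\s+|\s+$//g;

    return substr $s, 0, $len;          # first $len chars of what is left
}

my $html = '<html><head><style>b {color: red}</style></head>'
         . '<body><!-- note --><b>Hello</b>&nbsp;world</body></html>';
print synopsis($html), "\n";
```

Calling synopsis($html, 40) would give a shorter excerpt; with no second argument it falls back to 100 characters.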

--Steve


> # get 100 chars from body.
> if ( $s =~ /<\s*body\s*(.{100})/is ) { 
>    print "hun: $1.\n"; 
> }
> else { print "problem getting 100.\n" }

_______________________________________________
Perl-Win32-Web mailing list
[EMAIL PROTECTED]
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
