Robert Kerry wrote:
I want to use Perl to extract keywords from plaintext, but I don't know
whether there is an existing package / algorithm for doing that.
Thank you.

Regards,

Robert.
If you are trying to "extract" keywords the way a search engine might (i.e. find all words of substance), you could split the document on whitespace (\s), loop through the resulting list while skipping non-words and insubstantial words, and increment the corresponding hash element (the rating, or number of occurrences) for each word you keep. You can then do whatever is appropriate with those counts.

Something like this should work if you are only searching local documents that don't contain single tokens you would rather treat as multiple words (hyphenated terms, for example). Otherwise, you could change \s to \W so that punctuation also separates words:
my %keywords;
foreach my $word (split /\s+/, $document) {
    # badword() is a function you supply; it returns true
    # if $word is a common or "bad" word
    next if badword($word);
    # ++ creates a missing key on first use, so no exists check is needed
    $keywords{$word}++;
}
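
To make the loop above concrete, here is a minimal, self-contained sketch of the same idea. Everything beyond the loop itself is an assumption made for illustration: the stop-word list, the three-character minimum, the lowercasing, and the helper name count_keywords are hypothetical, and badword() is just one possible way to write the check. It splits on \W+ rather than \s+ so punctuation does not stick to the words, then prints the counts with the most frequent words first.

```perl
use strict;
use warnings;

# Hypothetical stop-word list; extend it to suit your documents.
my %stopwords = map { $_ => 1 }
    qw(a an and are as at be by for from in is it of on or that the this to with);

# True if the word is too short or too common to count as a keyword.
# The 3-character minimum is an arbitrary choice for this sketch.
sub badword {
    my ($word) = @_;
    return length($word) < 3 || exists $stopwords{$word};
}

# Count every substantive word in a document; returns a hash reference.
sub count_keywords {
    my ($document) = @_;
    my %keywords;
    # Lowercase first, then split on runs of non-word characters.
    foreach my $word (split /\W+/, lc $document) {
        # split can yield a leading empty string; skip it and the bad words.
        next if $word eq '' || badword($word);
        $keywords{$word}++;
    }
    return \%keywords;
}

my $document = 'Perl is a language. Perl makes text processing easy, '
             . 'and text is everywhere.';
my $keywords = count_keywords($document);

# Most frequent first; ties are broken alphabetically.
foreach my $word (sort { $keywords->{$b} <=> $keywords->{$a} || $a cmp $b }
                  keys %$keywords) {
    printf "%-12s %d\n", $word, $keywords->{$word};
}
```

Note that splitting on \W breaks "don't" into "don" and "t" and splits hyphenated terms, so stay with \s if you need those kept whole.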

