Re: [CODE4LIB] lingua::stem::snowball [resolved]

2009-10-13 Thread Eric Lease Morgan
On Oct 12, 2009, at 10:27 PM, Benjamin Florin wrote:
> foreach my $word (keys %words)
> {
>     $words_stems{$stemmer->stem($word)} += $words{$word};
> }
>
> foreach my $idea (@ideas)
> {
>     my $idea_stem = $stemmer->stem( $idea );
>     print "$idea ($idea_stem)\n";
>     print $words_

Re: [CODE4LIB] lingua::stem::snowball

2009-10-12 Thread Matt Jones
Presumably the call to stem() is the expensive part of your loop; if so, I'd want to cut it out. It looks to me like you can pass an array reference to stem(), so there's no need to call stem() in a loop at all. I'd think something like the code below should help reduce your
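
The array-reference idea can be sketched roughly as follows. This is only an illustration, not the code from the original message: the counts in %words are made up (the first two entries echo the question that started the thread), and %stem_counts is a name chosen here. It assumes stem() returns the stems in the same order as the words it was given.

```perl
use strict;
use warnings;
use Lingua::Stem::Snowball;

# Illustrative counts only; the last two entries are invented for this sketch.
my %words = (
    books     => 5,
    library   => 6,
    book      => 3,
    libraries => 2,
);

my $stemmer = Lingua::Stem::Snowball->new( lang => 'en' );

# Stem every key in one call by passing an array reference,
# instead of calling stem() once per word.
my @keys  = keys %words;
my @stems = $stemmer->stem( \@keys );

# Add each word's count to the running total for its stem.
my %stem_counts;
for my $i ( 0 .. $#keys ) {
    $stem_counts{ $stems[$i] } += $words{ $keys[$i] };
}

# Look up the total for the stem of one word of interest.
my $idea      = 'books';
my $idea_stem = $stemmer->stem($idea);    # scalar context: a single stem
print "$idea ($idea_stem): $stem_counts{$idea_stem}\n";
```

Once %stem_counts is built, each further lookup costs one stem() call on the word of interest plus a hash access, so the full word list is stemmed only once.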

Re: [CODE4LIB] lingua::stem::snowball

2009-10-12 Thread Benjamin Florin
It's been a while since I perled, so this might not be the most idiomatic solution, but you could stem the entire words hash once and build a hash of the summed counts per stem (%words_stems), then run through the list of idea words (@ideas), looking up only the desired stems:

use strict;
use Lingua::Stem::Snowball;

[CODE4LIB] lingua::stem::snowball

2009-10-12 Thread Eric Lease Morgan
Can someone help me use Lingua::Stem::Snowball more efficiently? I want to count the total number of times a word stem appears in a hash. Here is a short example:

use strict;
use Lingua::Stem::Snowball;
my $idea = 'books';
my %words = (
  'books'   => 5,
  'library' => 6,