Dear Chris,
> The word I have written here is [...] masculine, and
> ends in "s." Not only that, but you change this word from plural to
> singular and from masculine to feminine, all by adding an "s" to it!
> [...]
> If the word in question is in /usr/share/dict/words, then it should be
> one of the (hopefully) rare words that is a -ss word that, when the last
> -s is dropped, is also in the larger -s list.
I happen to know the answer, so I looked at my copy of
/usr/dict/words.
It contains the feminine /ss$/ word and the *singular*
masculine word, but the plural masculine /[^s]s$/ word
is not included.
This shows that simple grepping in /usr/dict/words will not
suffice.
Perhaps we can apply Lingua::EN::Inflect to every entry of
/usr/dict/words, but I don't know how comprehensive the
Lingua::EN::Inflect database is. (Here is hoping that Damian
isn't reading this insinuation. :-)
> Can anyone think of a clever way to do this ?
"clever" + "search" == "hash", in many cases. :-)
Assuming that the dictionary is comprehensive enough, and
sorted by ascending length (alphabetical order doesn't matter),
we can use a hash: each shorter word is then already in the
hash by the time its -ss form turns up.
|| % perl -wMstrict -lne 'use vars qq.%x.; (my $y = $_) =~ s/ss$/s/; \
|| print if $x{$y}; $x{$_}++;' /usr/dict/words
|| ass
|| buss
|| canvass
|| discuss
|| Douglass
|| hiss
|| %
For huge dictionaries, we can save memory by deleting
strings of length N-2 from the hash each time we hit a
higher word length N.
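
A rough sketch of that pruning idea, in case it helps. The sub name
ss_pairs_pruned and the tiny word list are made up for illustration,
and the input is assumed to be sorted by ascending length already.
Bucketing the hash by word length is my own choice here; it just makes
it cheap to drop a whole length at once.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the -ss words whose one-"s"-shorter form
# was seen earlier. ASSUMES the input is sorted by ascending length.
sub ss_pairs_pruned {
    my @words = @_;
    my %by_len;          # length => { word => 1 }
    my $cur_len = 0;     # longest word length seen so far
    my @hits;

    for my $word (@words) {
        my $n = length $word;
        if ($n > $cur_len) {
            # A word of length N can only match a word of length N-1,
            # so buckets of length N-2 or less are no longer needed.
            delete $by_len{$_} for grep { $_ <= $n - 2 } keys %by_len;
            $cur_len = $n;
        }
        if ($word =~ /ss$/) {
            (my $short = $word) =~ s/ss$/s/;
            push @hits, $word
                if exists $by_len{ $n - 1 } && $by_len{ $n - 1 }{$short};
        }
        $by_len{$n}{$word} = 1;
    }
    return @hits;
}

# Illustrative sample only, already sorted by length.
my @sample = qw(as his ass hiss grass discus discuss);
print "$_\n" for ss_pairs_pruned(@sample);
```

At any moment the hash holds only the words of the current length and
the one below it, which is where the memory saving comes from.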
If the dictionary was not sorted by length, a hash would
still work, but we would need two passes -- one to load up
the hash, and the second to look up the truncated versions of
the words.
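
The two-pass version might look something like this. Again, the sub
name ss_pairs and the sample words are just for illustration; with a
real dictionary you would fill @words from the file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the -ss words whose one-"s"-shorter form
# is also in the list. The order of the input does not matter.
sub ss_pairs {
    my @words = @_;

    # Pass 1: load every word into the hash.
    my %seen = map { $_ => 1 } @words;

    # Pass 2: look up the truncated version of each -ss word.
    my @hits;
    for my $word (@words) {
        next unless $word =~ /ss$/;
        (my $short = $word) =~ s/ss$/s/;
        push @hits, $word if $seen{$short};
    }
    return @hits;
}

# Illustrative sample only, deliberately in no particular order.
my @sample = qw(hiss discuss ass his discus as grass);
print "$_\n" for ss_pairs(@sample);
```

The price is that the whole word list sits in memory at once, so the
pruning trick above no longer applies.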
The Lingua::EN::Inflect approach would obviously not be quite
so straightforward.
peace, || Byatrayanapura: Better governance thru online taxes:
--{kr.pA} || http://tinyurl.com/296js
--
"If I have not seen farther, it is because giants have stood on my shoulders."
-- V. Guhan. [with apologies to Newton, Sir Isaac.]
_______________________________________________
Boston-pm mailing list
http://mail.pm.org/mailman/listinfo/boston-pm