Dear Chris,

>     The word I have written here is [...] masculine, and
>     ends in "s." Not only that, but you change this word from plural to
>     singular and from masculine to feminine, all by adding an "s" to it!
> [...]
> If the word in question is in /usr/share/dict/words, then it should be 
> one of the (hopefully) rare words that is a -ss word that, when the last 
> -s is dropped, is also in the larger -s list.

I happen to know the answer, so I looked at my copy of
/usr/dict/words.

It contains the feminine /ss$/ word and the *singular*
masculine word.  But the plural masculine /[^s]s$/ word
is not included.

This shows that simple grepping in /usr/dict/words will not
suffice.

Perhaps we can apply Lingua::EN::Inflect to every entry of
/usr/dict/words, but I don't know how comprehensive the
Lingua::EN::Inflect database is.  (Here is hoping that Damian
isn't reading this insinuation.  :-)


> Can anyone think of a clever way to do this ?

"clever" + "search" == "hash", in many cases.  :-)

Assuming that the dictionary is comprehensive enough, and
sorted by ascending length (alphabetical order doesn't
matter), we can use a hash: each shorter -s word is already
in the hash by the time we reach its -ss counterpart.

  || % perl -wMstrict -lne 'use vars qq.%x.; (my $y = $_) =~ s/ss$/s/; \
  ||       print if $x{$y}; $x{$_}++;' /usr/dict/words
  || ass
  || buss
  || canvass
  || discuss
  || Douglass
  || hiss
  || %

For huge dictionaries, we can save on memory by discarding
strings of length N-2 or shorter from the hash each time we
hit a higher word length N; only words of length N-1 can
still match an -ss word of length N.
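
A rough sketch of that pruning idea, in case it helps.  It
assumes the input really is sorted by ascending length, and
instead of deleting keys from one hash it keeps two small
hashes (length N-1 and length N), which is my own variation.
The name ss_pairs_pruned is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read words from a filehandle that is sorted by
# ascending length, and return each word ending in "ss" whose form
# with the last "s" dropped was seen among the words one shorter.
sub ss_pairs_pruned {
    my ($fh) = @_;
    my (%prev, %curr);    # words of length N-1, words of length N
    my $n = 0;            # current word length N
    my @hits;
    while (my $word = <$fh>) {
        chomp $word;
        my $len = length $word;
        if ($len > $n) {
            # New length: keep only the bucket exactly one shorter,
            # and drop everything older.
            %prev = $len == $n + 1 ? %curr : ();
            %curr = ();
            $n    = $len;
        }
        push @hits, $word
            if $word =~ /ss$/ && $prev{ substr($word, 0, -1) };
        $curr{$word} = 1;
    }
    return @hits;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    print "$_\n" for ss_pairs_pruned($fh);
}
```

At any moment the two hashes hold at most two word lengths'
worth of entries, whatever the size of the dictionary.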

If the dictionary were not sorted by length, a hash would
still work, but we would need two passes -- one to load up
the hash, and the second to look up the truncated versions of
the words.
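
The two-pass version might look something like this.  It is
only a sketch: the name ss_pairs_two_pass is hypothetical,
and it takes the whole word list in memory, which is
unavoidable here since the hash ends up holding every word
anyway.

```perl
use strict;
use warnings;

# Hypothetical helper: given a word list in any order, return each
# word ending in "ss" whose form with the last "s" dropped is also
# in the list.
sub ss_pairs_two_pass {
    my @words = @_;

    # Pass 1: load every word into the hash.
    my %seen = map { $_ => 1 } @words;

    # Pass 2: look up the truncated version of each -ss word.
    my @hits;
    for my $word (@words) {
        next unless $word =~ /ss$/;
        push @hits, $word if $seen{ substr($word, 0, -1) };
    }
    return @hits;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    chomp(my @words = <>);
    print "$_\n" for ss_pairs_two_pass(@words);
}
```

Since the lookup happens only after the hash is complete, the
order of the dictionary no longer matters at all.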

The Lingua::EN::Inflect approach would obviously not be quite
so straightforward.

peace,          || Byatrayanapura: Better governance thru online taxes:
--{kr.pA}       || http://tinyurl.com/296js
-- 
"If I have not seen farther, it is because giants have stood on my shoulders."
    -- V. Guhan.  [with apologies to Newton, Sir Isaac.]
_______________________________________________
Boston-pm mailing list
[EMAIL PROTECTED]
http://mail.pm.org/mailman/listinfo/boston-pm