On 11/28/07, Grzegorz Chrupala <[EMAIL PROTECTED]> wrote:
> You may have better luck checking out methods used in parsing natural
> language. In order to use statistical parsing techniques such as
> Probabilistic Context Free Grammars ([1], [2]), the standard approach is to
> extract rule probabilities from an annotated corpus, that is, a collection of
> strings with associated parse trees. Maybe you could use the 2/3 of your
> addresses that you know are correctly parsed as your training material.
>
> A PCFG parser can output all (or the n-best) parses ordered by
> probability, so that would seem to fit your requirements.
>
> [1] http://en.wikipedia.org/wiki/Stochastic_context-free_grammar
> [2] http://www.cs.colorado.edu/~martin/slp2.html#Chapter14
>
> --
> Best,
> Grzegorz

Hi Grzegorz,

Wow, Natural Language Processing looks quite complex! But it also seems to be
closely related to my problem. If anyone knows of an "NLP for dummies" article
or book, I'm interested. ;-)

Thanks for your help,

Olivier.
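
For anyone following the thread, the "extract rule probabilities from an annotated corpus" step in the quoted suggestion can be sketched in a few lines. This is only a toy illustration, not a parser: the three address trees, the labels (ADDRESS, NUMBER, STREET, TYPE, NAME) and the function names are all made up for the example, and probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy treebank. A tree is (label, child, child, ...);
# a leaf is a plain string. Labels and addresses are invented.
TREEBANK = [
    ("ADDRESS", ("NUMBER", "12"),
                ("STREET", ("TYPE", "rue"), ("NAME", "Hugo"))),
    ("ADDRESS", ("NUMBER", "5"),
                ("STREET", ("TYPE", "avenue"), ("NAME", "Foch"))),
    ("ADDRESS", ("STREET", ("TYPE", "place"), ("NAME", "Bellecour"))),
]


def rules(tree):
    """Yield one (lhs, rhs) pair for every internal node of a tree."""
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    yield label, rhs
    for child in children:
        if not isinstance(child, str):
            yield from rules(child)


def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(r for tree in treebank for r in rules(tree))
    lhs_counts = Counter()
    for (lhs, _), n in rule_counts.items():
        lhs_counts[lhs] += n
    return {rule: Fraction(n, lhs_counts[rule[0]])
            for rule, n in rule_counts.items()}


def tree_probability(tree, probs):
    """Probability of a parse: the product of its rule probabilities.
    A rule never seen in training gets probability 0 (no smoothing)."""
    p = Fraction(1)
    for rule in rules(tree):
        p *= probs.get(rule, Fraction(0))
    return p


if __name__ == "__main__":
    probs = estimate(TREEBANK)
    for (lhs, rhs), p in sorted(probs.items()):
        print(f"{lhs} -> {' '.join(rhs)}  [{p}]")
```

Given several candidate parses of the same address, sorting them by tree_probability would give the kind of ranked list mentioned in the quote. A real PCFG parser would of course search for those candidates itself rather than be handed them.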
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe