Hi,

I'm playing with bitpar for parsing and annotating English content. I modified parse-de-bitpar.perl to use the TraceParser grammar files instead of the German Tiger files. When I tried to annotate my corpus, parse-de-bitpar.perl died on me in two cases:
1. A bracketed structure like (a (b (c))(d)) does not get parsed correctly: parse-de-bitpar.perl chokes on the double (or multiple) closing brackets, as in "c))".
2. Quoted brackets are not parsed correctly: bitpar emitted something like "\<\(xyz\)\>", which brought parse-de-bitpar.perl down.

I can provide exact examples if anybody is interested. The patch below fixed both cases for me: it additionally treats a closing bracket as a delimiter when splitting off the label, and it skips any delimiter that is preceded by a backslash.

Does anybody have experience to share regarding syntax annotation? Is Collins the way to go for English?

best regards
Christof

diff -w local/bin/parse-bitpar.perl ~/libexec/moses-chart/bin/scripts/training/wrappers/parse-de-bitpar.perl
61c55
< my ($label,$rest) = split(/(?<!\\)[\)\( ]/,substr($line,$i+1));
---
> my ($label,$rest) = split(/[\( ]/,substr($line,$i+1));
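For illustration, the effect of the changed delimiter pattern can be reproduced outside the script. The following is only a minimal sketch in Python, not the Perl wrapper itself: the helper name first_field and the two input strings are made up for the example (they are not actual bitpar output), and only the first field of the split is compared, since Perl's split and Python's re.split differ in how they handle limits and trailing fields.

```python
import re

# The two delimiter patterns from the diff, copied into Python regex syntax.
OLD_DELIM = re.compile(r"[\( ]")           # original: split on "(" or space
NEW_DELIM = re.compile(r"(?<!\\)[\)\( ]")  # patched: also split on ")", skip escaped delimiters


def first_field(delim, text):
    """Return the text before the first delimiter match (the would-be label)."""
    return delim.split(text)[0]


# Hypothetical remainder of a line after the "(" that opens the node "c".
after_open = "c))(d))"
print(first_field(OLD_DELIM, after_open))  # label still carries the closing brackets
print(first_field(NEW_DELIM, after_open))  # label stops at the first ")"

# Hypothetical token with backslash-quoted brackets.
quoted = r"\(xyz\) rest"
print(first_field(OLD_DELIM, quoted))      # cut at the escaped "(", leaving a lone backslash
print(first_field(NEW_DELIM, quoted))      # escaped brackets stay inside the label
```

With the original pattern the label for the first input comes out as "c))", and the quoted token is cut at its escaped opening bracket. With the patched pattern the labels are "c" and "\(xyz\)" respectively.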