Hi,

I'm playing with bitpar for parsing and annotating English content. I 
modified parse-de-bitpar.perl to use the TraceParser grammar files 
instead of the German Tiger files. When I tried to annotate my corpus,
parse-de-bitpar.perl died on me in two cases:

1. A bracketed structure like (a (b (c))(d)) is not parsed correctly.
parse-de-bitpar.perl chokes on the double (or multiple) closing brackets,
as in "c))".

2. Backslash-quoted brackets are not parsed correctly. bitpar emitted
something like "\<\(xyz\)\>", which brought parse-de-bitpar.perl down. I
can provide exact examples if anybody is interested.

The patch below fixed both cases for me.

Does anybody have experience to share regarding syntax annotation? Is
the Collins parser the way to go for English?

Best regards,
Christof




diff -w local/bin/parse-bitpar.perl ~/libexec/moses-chart/bin/scripts/training/wrappers/parse-de-bitpar.perl
61c55
<  my ($label,$rest) = split(/(?<!\\)[\)\( ]/,substr($line,$i+1));
---
>  my ($label,$rest) = split(/[\( ]/,substr($line,$i+1));
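
To make the effect of the changed pattern easier to see, here is a small sketch that applies both character classes from the diff to the two problem cases. It is written in Python rather than Perl purely for illustration (the regex syntax is the same for these patterns). The helper name first_field and the second input string are made up for the example, and the sketch only mimics the part of the Perl line that ends up in $label, not the whole wrapper script.

```python
import re

# The two patterns from the diff, written as Python regexes.
ORIGINAL = re.compile(r"[( ]")         # splits on "(" or space only
PATCHED = re.compile(r"(?<!\\)[)( ]")  # also splits on ")", but never on a
                                       # character preceded by a backslash


def first_field(pattern, text):
    """Return the first field of the split, i.e. what would land in $label."""
    return pattern.split(text)[0]


# Case 1: the text following the "(" that opens "c" in "(a (b (c))(d))".
nested = "c))(d))"

# Case 2: a hypothetical fragment containing backslash-quoted brackets.
escaped = r"\(xyz\) rest"

for text in (nested, escaped):
    print(
        repr(text),
        "original:", repr(first_field(ORIGINAL, text)),
        "patched:", repr(first_field(PATCHED, text)),
    )
```

With the original pattern the label for the first case still carries the closing brackets, and in the second case the split happens inside the quoted bracket. With the patched pattern the label stops at the first unquoted bracket or space.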