The Johnson et al. technique is not used by default.  You have to
build sigtest-filter yourself (it requires a suffix array toolkit from
CMU; instructions are in its directory).  The parameters are set on the
command line, and running the binary with no arguments prints usage
instructions.  The standard values that Johnson suggests are available
by default.
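
To give a rough idea of what the threshold means, here is a minimal
sketch of the significance test described in Johnson et al. (2007):
a one-sided Fisher's exact test on the co-occurrence counts of a
phrase pair, with the pair kept only if -log(p) exceeds the threshold.
This is not the sigtest-filter code; the function names, the toy counts
and the epsilon value are made up for illustration.

```python
import math


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def neg_log_pvalue(c_st, c_s, c_t, n):
    """-log of the one-sided Fisher exact p-value for a phrase pair.

    c_st: sentence pairs containing both the source and target phrase
    c_s:  sentence pairs containing the source phrase
    c_t:  sentence pairs containing the target phrase
    n:    total number of sentence pairs
    """
    log_terms = []
    # Sum the hypergeometric tail: co-occurrence counts >= the observed one.
    for k in range(c_st, min(c_s, c_t) + 1):
        if c_t - k > n - c_s:
            continue  # this table cannot occur with the given margins
        log_terms.append(
            log_choose(c_s, k) + log_choose(n - c_s, c_t - k) - log_choose(n, c_t)
        )
    # log-sum-exp keeps the sum stable when the probabilities are tiny.
    m = max(log_terms)
    log_p = m + math.log(sum(math.exp(x - m) for x in log_terms))
    return -log_p


def keep_pair(c_st, c_s, c_t, n, threshold):
    """Keep the phrase pair only if it is significant enough."""
    return neg_log_pvalue(c_st, c_s, c_t, n) > threshold


if __name__ == "__main__":
    n = 1000                 # hypothetical corpus size
    alpha = math.log(n)      # -log(p) of a pair seen exactly once (1-1-1)
    eps = 0.01               # hypothetical small offset
    print(keep_pair(1, 1, 1, n, alpha + eps))     # 1-1-1 pair is pruned
    print(keep_pair(1, 1, 1, n, alpha - eps))     # 1-1-1 pair is kept
    print(keep_pair(10, 10, 10, n, alpha + eps))  # strongly associated pair
```

A pair whose source phrase, target phrase and co-occurrence are each
seen exactly once has p = 1/N, so -log(p) = log(N).  That is why the
paper expresses its thresholds relative to alpha = log(N): a threshold
just above alpha discards all such pairs, and one just below keeps them.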

Chris

On Wed, Jul 23, 2008 at 5:52 PM, marco turchi <[EMAIL PROTECTED]> wrote:
> Hi
> thanks a lot for the answers...
> Sorry, my question was not clear. I was wondering whether Giza, or a later
> stage of the Moses pipeline (the software that creates the phrases), discards
> alignments that are not supported by enough data in the training set.
>
> If I have understood correctly, the answer is yes: the (Johnson et al.
> 2007) technique.
>
> Just one more question (maybe two :-) ):
> Is the (Johnson et al. 2007) technique used by default in Moses?
> How is the threshold set?
>
> thanks a lot
> Marco
>
>
> On Wed, Jul 23, 2008 at 6:35 PM, Chris Dyer <[EMAIL PROTECTED]> wrote:
>>
>> > There are also various post hoc approaches to removing noise from
>> > phrase tables and alignments.  Some recent examples:
>> > http://aclweb.org/anthology-new/D/D07/D07-1103.pdf
>> > http://aclweb.org/anthology-new/W/W08/W08-0306.pdf
>> >
>> > Although there's nothing like this included in Moses, it would be easy
>> > to contribute one as a standalone script.
>> Actually, the first paper above (Johnson et al. 2007) is implemented
>> and included in Moses.  I haven't found that it ever improves the
>> translation performance, but it can significantly reduce the size of
>> your phrase table.
>>
>> Chris
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
