Re: [Moses-support] Sparse features and overfitting

HOANG Cong Duy Vu Thu, 15 Jan 2015 15:50:08 -0800

Thanks for your replies!

Hi Prashant,

> there is definitely an option for sparse l1/l2 regularization with mira. I
> don't know how to call it through command line though.

Yes. For MIRA, the *C* parameter controls the regularization strength.
I tried different C values (0.01, 0.001), but none of them helped in my case.

Hi Matthias,

> Do the sparse features give you any large improvement on the tuning set?

Yes. The improvement is about 2-3 BLEU points on the tuning set.

> Does this mean that there are hundreds of sentences in your original
> tuning and test sets that are equal on the source side but have
> different references? That sounds a bit odd. Maybe it indicates that
> something about your data is generally problematic.

Yes, I agree it is quite odd. But this data (Chinese-to-English) comes from
an official competition.
I will probably have to remove the overlapping sentences before moving on to
other kinds of features.
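
A minimal sketch of that filtering step, in case it is useful to others. It
assumes the overlap is decided on the source side only, that the test set is
left untouched and the tuning set is the one being filtered, and that
sentences are compared after whitespace normalization. The function names
and the toy data are hypothetical, not part of Moses.

```python
def normalize(line):
    """Collapse runs of whitespace so trivially different lines compare equal."""
    return " ".join(line.split())


def overlap_size(sentences_a, sentences_b):
    """Count distinct (normalized) sentences that occur in both collections."""
    set_a = {normalize(s) for s in sentences_a}
    set_b = {normalize(s) for s in sentences_b}
    return len(set_a & set_b)


def remove_overlap(tune_pairs, test_sources):
    """Drop tuning (source, reference) pairs whose source is also in the test set."""
    seen = {normalize(s) for s in test_sources}
    return [(src, ref) for src, ref in tune_pairs if normalize(src) not in seen]


# Toy data standing in for the real tuning and test files.
tune = [
    ("ni hao  shijie", "hello world"),
    ("xie xie", "thank you"),
    ("zai jian", "goodbye"),
]
test_sources = ["xie xie", "zao shang hao"]

filtered_tune = remove_overlap(tune, test_sources)
print(overlap_size([src for src, _ in tune], test_sources))
print(len(filtered_tune))
```

Filtering on the source side is the stricter choice here, since the source
overlap (624) is larger than the target overlap (386).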

--
Cheers,
Vu

On Fri, Jan 16, 2015 at 6:31 AM, Matthias Huck <[email protected]> wrote:

> On Thu, 2015-01-15 at 13:54 +0800, HOANG Cong Duy Vu wrote:
>
>
> > - tune & test
> > (based on source)
> > size of overlap set = 624
> > (based on target)
> > size of overlap set = 386
>
> >
> > (tune & test have high overlapping parts based on source sentences,
> > but half of them have different target sentences)
>
>
>
> Does this mean that there are hundreds of sentences in your original
> tuning and test sets that are equal on the source side but have
> different references? That sounds a bit odd. Maybe it indicates that
> something about your data is generally problematic.

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
