Hi Ergun,
Thank you for your answer. However, for clarification, I am well aware of how to 
do preprocessing (and it's still evil :) ); that's not my point.

I am just asking whether there are now smarter tools around that help 
with experimental setup and management, for instance for better 
dissemination. Every now and then new tools pop up, be it 
duct tape, EMS, or whatever has been used before. I was wondering 
whether there is some new development in this regard that I might be 
unaware of, and whether people on the list have heard about new tools and 
have experience with them.

Ready-made preprocessed files are the opposite of what I need :)

On 26.11.2017 at 13:27, Ergun Bicici wrote:
>
> Dear Marcin,
>
> I have uploaded my EMS files for WMT'16:
> https://github.com/bicici/ParFDAWMT16
>
> Text processing steps can be language-dependent, might require domain 
> knowledge and expertise, and can distinguish you from others, elevating 
> your results.
> I suggest reading the relevant sections of the papers by WMT 
> participants to get a feel for the computational requirements that are 
> not necessarily made obvious, such as the use of unsupervised learning 
> of classes in language models and alignment. Text processing helps the 
> datasets take the form you would like them to have, even if you consider 
> it evil. If removing punctuation from some dataset helps, then this may 
> be considered ingenious as well.
>
> Barry Haddow has prepared preprocessed WMT'17 datasets:
> http://data.statmt.org/wmt17/translation-task/preprocessed/
> http://www.statmt.org/wmt17/translation-task.html
>
>
> Regards,
> Ergun
>
>
> On Sun, Nov 26, 2017 at 12:41 PM, Marcin Junczys-Dowmunt 
> <[email protected]> wrote:
>
>     Hi list,
>
>     I am preparing a couple of usage examples for my NMT toolkit and
>     got hung up on all the preprocessing and other evil stuff. I am
>     wondering: is there now anything decent around for doing
>     preprocessing, running experiments and evaluation? Or is the best
>     thing still GNU make (isn't that embarrassing)?
>
>     Best,
>
>     Marcin
>
>     _______________________________________________
>     Moses-support mailing list
>     [email protected] <mailto:[email protected]>
>     http://mailman.mit.edu/mailman/listinfo/moses-support
>     <http://mailman.mit.edu/mailman/listinfo/moses-support>
>
