Hi all,

I was wondering whether we can write a script that
automatically runs all the steps needed for
training. Any pointers on this would be appreciated.

Also, I would like to know whether, for any corpus,
symbols like the full stop, question mark, comma,
apostrophe, etc. play a significant role. I mean,
can these be included in the corpus? And
why do we have to lowercase everything?

Another thing: I know that the size of the
corpus should be as big as possible, but there should
be a threshold. This exponential increase should stop
somewhere, at a point where increasing the size won't improve the
accuracy. Or will it keep improving?

Thanks in advance.

Regards, Vineet




_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
