lmplz splits on whitespace only, so each word|POS token is treated as a
single word. This is documented under the corpus formatting notes:
https://kheafield.com/code/kenlm/estimation/

If you want a language model over parts of speech, give it
space-separated parts of speech. This script may be of use to you in
discarding the words:

scripts/generic/extract-factors.pl /path/to/file 1
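
If the Perl script is not at hand, the same idea can be sketched in a few
lines of Python. This is only an illustration, assuming tokens look like
surface|POS with "|" as the separator and the tag as factor 1 (counting
from 0). The name extract_factors is made up for this example and is not
part of Moses or KenLM.

```python
def extract_factors(line, indices, sep="|"):
    """Keep only the chosen factors (0-based) of each whitespace-separated token.

    Assumes every token contains all requested factors; a token without
    the separator would raise IndexError for any index above 0.
    """
    kept = []
    for token in line.split():
        factors = token.split(sep)
        kept.append(sep.join(factors[i] for i in indices))
    return " ".join(kept)


# Hypothetical sample line in surface|POS format.
sample = "the|DT cat|NN sleeps|VBZ"
print(extract_factors(sample, [1]))  # prints: DT NN VBZ
```

The resulting tag-only text is what lmplz should see, for example
bin/lmplz -o 3 --discount_fallback < pos.txt > pos.arpa
(the order 3 and the file names are placeholders).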

On 10/29/2017 01:28 AM, Aileen Joan Vicente wrote:
> Does this mean that if I run lmplz on a tagged corpus (with this format:
> surface form|POS), the program will automatically generate a
> part-of-speech language model?
> 
> Or should I run lmplz on a parallel corpus of POS tags (parallel to the
> sentences from which the tags were generated)?
> 
> Thank you for your reply.
> 
> 
> 
> On Sat, Oct 28, 2017 at 10:48 PM, Kenneth Heafield wrote:
> 
>     Hi,
> 
>     You convert the words to parts of speech using an external tagger
>     (lmplz does not include POS detection). Then you'll probably need
>     to run lmplz with --discount_fallback because the vocabulary is small.
> 
>     Kenneth
> 
>     On 10/28/2017 02:06 AM, Aileen Joan Vicente wrote:
>     > Hi! I am learning Factored Training and the tutorial suggests building
>     > a part-of-speech language model. I have already tried building one on
>     > English training sentences, and I wonder if there is an option in lmplz
>     > to direct the program to look at the sentence's POS tags. I've been
>     > googling for two days and I haven't found the answer yet.
>     >
>     > Thank you for your response.
>     >
>     > Best,
>     >
>     > Aileen Joan Vicente
>     > UP Cebu Philippines
>     >
>     >
>     > _______________________________________________
>     > Moses-support mailing list
>     > [email protected] <mailto:[email protected]>
>     > http://mailman.mit.edu/mailman/listinfo/moses-support
>     <http://mailman.mit.edu/mailman/listinfo/moses-support>
>     >
> 
> 
