Hi,
I am trying to build a language model with the RandLM tool, which uses a Bloom
filter. I ran the tool with the following commands:
../randlm/bin/buildlm -struct BloomMap -falsepos 8 -values 8 -output-prefix model < ./train2
../randlm/bin/querylm -randlm model.BloomMap -test-path train2 -test-type corpus > scores
where the file train2 contains the tokenized, lowercased corpus. The second
command produced the file scores, which contains the log probabilities of
the n-grams.
However, the scores file does not contain the n-grams themselves, so it is
unclear which n-grams these probabilities correspond to. Could you please
help: how can I extract something like the ARPA format from the files that
RandLM produces?
Thanks,
Michael.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support