Hi Judah
The tokeniser also escapes some characters which have special meaning
for Moses. At decoding time the most important of these is the pipe
(|), which Moses treats as the separator between factors. A stray pipe
in untokenised input probably caused Moses to fail for you. URLs
shouldn't contain pipes, so leaving them unsplit should be safe.
cheers - Barry
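
To make the escaping concrete, here is a minimal Python sketch. It is not the tokeniser's own code: the helper names escape_for_moses and unescape_from_moses are hypothetical, and apart from the pipe (which the message above confirms) the table of replacements is an assumption based on the entities usually seen in tokenised Moses data, so check tokenizer.perl before relying on any other entry.

```python
# Hypothetical helpers, not part of Moses. Only the pipe escape is confirmed
# by the thread; the other entries are assumed and should be checked against
# tokenizer.perl.
MOSES_ESCAPES = [
    ("&", "&amp;"),    # must come first so later entities are not re-escaped
    ("|", "&#124;"),   # the pipe is the Moses factor separator
    ("<", "&lt;"),
    (">", "&gt;"),
    ("'", "&apos;"),
    ('"', "&quot;"),
    ("[", "&#91;"),
    ("]", "&#93;"),
]


def escape_for_moses(text: str) -> str:
    """Replace characters that are special to the decoder with entities."""
    for raw, escaped in MOSES_ESCAPES:
        text = text.replace(raw, escaped)
    return text


def unescape_from_moses(text: str) -> str:
    """Undo escape_for_moses; the ampersand is restored last."""
    for raw, escaped in reversed(MOSES_ESCAPES):
        text = text.replace(escaped, raw)
    return text


print(escape_for_moses("a | b"))  # a &#124; b
```

Running the decoder input through something like escape_for_moses, and the output through unescape_from_moses, keeps a literal pipe from being read as a factor boundary even when the rest of the line is left untokenised.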
On 15/07/14 13:59, Judah Schvimer wrote:
Hi,
Thank you very much! That's incredibly helpful. My one concern is that
the decoder was crashing before I started tokenizing its input. Do you
know which tokens would cause that behavior if left in? Would you
recommend leaving path names and URLs untokenized and tokenizing
everything else?
Judah
On Tue, Jul 15, 2014 at 4:02 AM, Barry Haddow
<[email protected]> wrote:
Hi Judah
The actual problem here is that you do not want path names split
by the tokeniser. It's only really set up to deal with regular
text, but what you can do is ask it to "protect" certain patterns
by using the
-protected <filename>
argument. The file <filename> should contain a list of regular
expressions (one per line), and the tokeniser will not split apart
any tokens which match these REs. I'm guessing that in the example
below you don't want "tutorial" translated into the target
language, and if the tokeniser doesn't split the path then the
whole thing will pass through as an OOV (out-of-vocabulary) token.
cheers - Barry
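
The idea of protecting patterns can be sketched in Python as follows. This is only an illustration of the technique under stated assumptions, not how tokenizer.perl is implemented: naive_tokenize is a crude stand-in for the real tokeniser, the two regular expressions are example patterns that would need tuning for real data, and the PROTECTED<n> placeholder scheme is made up for this sketch.

```python
import re

# Example contents for a protected-patterns file, one regular expression per
# line. Both patterns are assumptions chosen for the example in this thread.
PROTECTED_PATTERNS = [
    r"https?://\S+",            # URLs
    r"/[\w.-]+(?:/[\w.-]+)*",   # slash-separated path names
]

# Made-up placeholder format; assumes the input never contains such a token.
PLACEHOLDER = re.compile(r"PROTECTED(\d+)")


def naive_tokenize(text):
    """Stand-in tokeniser: split off every punctuation mark except '-'."""
    return re.sub(r"([^\w\s-])", r" \1 ", text).split()


def tokenize_with_protection(text, patterns):
    """Tokenise text, but keep spans matching any pattern as single tokens."""
    protected = []

    def stash(match):
        # Remember the matched span and leave a placeholder in its place.
        protected.append(match.group(0))
        return f" PROTECTED{len(protected) - 1} "

    combined = re.compile("|".join(f"(?:{p})" for p in patterns))
    tokens = naive_tokenize(combined.sub(stash, text))

    restored = []
    for tok in tokens:
        m = PLACEHOLDER.fullmatch(tok)
        restored.append(protected[int(m.group(1))] if m else tok)
    return restored


line = ":doc:`/tutorial/aggregation-zip-code-data-set`"
print(" ".join(naive_tokenize(line)))
# : doc : ` / tutorial / aggregation-zip-code-data-set `
print(" ".join(tokenize_with_protection(line, PROTECTED_PATTERNS)))
# : doc : ` /tutorial/aggregation-zip-code-data-set `
```

To use the same patterns with the real tokeniser, write them one per line to a file and pass that file with the -protected argument described above. Perl and Python regular expressions differ in places, so test the patterns against the tokeniser itself.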
On 14/07/14 16:53, Judah Schvimer wrote:
Hi,
When I'm using the decoder I have to tokenize my input
sentences before I translate them. However, when I detokenize
the output it leaves awkward spaces around what was tokenized. Is
there any way to fix this? It seems to happen mainly around
slashes and colons.
Source: :doc:`/tutorial/aggregation-zip-code-data-set`
Target: : Doc: '/ tutorial / aggregation-zip-code-data-set'
Thanks,
Judah
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.