I am not sending text to moses from a text file; I am using the command line:
m.bat contains:

    echo %1 | c:\\cygwin\\path\\to\\moses.exe -f c:\\cygwin\\path\\to\\moses.ini 2> msc_tywyddTeletestun.err

usage:

    > m.bat "bydd y bore 'n oer ."
    "bydd the morning will be cold ."

Philipp Koehn wrote:
> Hi,
>
> I have seen text files under windows that add a starting byte to
> indicate the encoding of the file. Since the first word is a problem,
> this may be the cause.
>
> -phi
>
> On Thu, Nov 26, 2009 at 2:34 PM, Ivan Uemlianin
> <[email protected]> wrote:
>> Hieu
>>
>> Thanks for your comment.
>>
>> How can this be a line-ending issue? Where are line-endings involved?
>>
>> What is appending an extra character to the first word and why?
>>
>> The 1st input word *is* being recognised and translated (as I said, the
>> translations under dos are correct) --- "bydd" translates to "will be".
>>
>> I'm using identical material under cygwin and dos, the only difference
>> is under cygwin I'm using a shell script and under dos I'm using a
>> ".bat" file. If it is a line-ending issue why is it affecting dos and
>> not cygwin?
>>
>>
>> Hieu Hoang wrote:
>>> hi ivan
>>>
>>> i think this might be a problem with line ending again. The non-printing
>>> 0x0D (carriage return) character is being appended to the 1st input word
>>> which causes it to be unrecognised so it is output verbatim. Cygwin
>>> probably has internal code which strips out this character
>>>
>>> make sure you convert all text files to unix line endings using
>>> dos2unix
>>>
>>> Ivan Uemlianin wrote:
>>>> Dear All
>>>>
>>>> Running the moses decoder on cygwin and dos gives slightly different
>>>> results, even though I'm using the same executable and the same models.
>>>>
>>>> For example, translating from Welsh to English:
>>>>
>>>> Welsh: bydd y bore 'n oer .
>>>> English: the morning will be cold .
>>>>
>>>> mo...@cygwin: morning will be cold .
>>>> mo...@dos: bydd the morning will be cold .
>>>>
>>>> The main problem is that on dos, moses is always returning the first
>>>> word of the source language, prepended to the translation itself.
>>>> Easy to strip off but annoying. The translation itself is often
>>>> slightly better on dos than on cygwin, as above (which is if anything
>>>> even stranger).
>>>>
>>>> Can anyone account for this strange behaviour? More important, how
>>>> can I stop the first word of source language returning?
>>>>
>>>> Thanks and best wishes
>>>>
>>>> Ivan
>>>>
>>
>> --
>> ********************************
>> Ivan Uemlianin
>>
>> Canolfan Bedwyr
>> Safle'r Normal Site
>> Prifysgol Bangor University
>> BANGOR
>> Gwynedd
>> LL57 2PZ
>>
>> [email protected]
>> ********************************
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>

--
********************************
Ivan Uemlianin

Canolfan Bedwyr
Safle'r Normal Site
Prifysgol Bangor University
BANGOR
Gwynedd
LL57 2PZ

[email protected]
********************************
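
The thread above suggests that some stray character is attached to the first input word (a starting encoding byte, or a line-ending character), so that the word no longer matches the model and is passed through untranslated. A minimal sketch of how one might check what actually reaches the decoder follows. It is not part of moses: `stray_tokens` and the `STRAY` table are hypothetical names, and the list of suspect characters is an assumption. The double quote is included only because cmd's `%1` keeps the quotes that surround an argument, which may or may not be the cause here.

```python
# Hypothetical diagnostic helper (not part of moses): flag input tokens that
# carry a stray character an exact vocabulary lookup would not match.
# The set of suspect characters below is an assumption, not a complete list.
STRAY = {
    "\ufeff": "byte-order mark",
    "\r": "carriage return",
    '"': "double quote",
}


def stray_tokens(raw: bytes) -> list[tuple[str, str]]:
    """Return (token, reason) pairs for tokens containing a stray character."""
    text = raw.decode("utf-8")
    found = []
    # Split on spaces and newlines only, so that "\r" stays attached
    # to the token it follows instead of being silently discarded.
    for token in text.replace("\n", " ").split(" "):
        for char, reason in STRAY.items():
            if char in token:
                found.append((token, reason))
    return found
```

To use it, save the piped text to a file instead of sending it to moses, then pass the file's bytes to `stray_tokens`. An empty list means none of the suspect characters is present, in which case the cause lies elsewhere.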
