Hi all,

I’m having this really weird Unicode issue when using compact phrase tables 
that could be related to endianness somehow, but I’ve no idea how.
I compiled the training tools from v3 on my Mac and built a few models using 
compact phrase (and reordering) tables and KenLM, including (for simplicity) a 
recasing model for DE (download it from https://autodesk.box.com/DE-Recaser). 
Things become strange when I try to use the models, though:
1. Everything works fine when I use the decoder binary I compiled myself on 
the Mac (10.10.2, self-built Boost 1.57).
2. Unicode input is not recognised when I use the binary from 
http://www.statmt.org/moses/RELEASE-3.0/binaries/macosx-yosemite/, i.e. words 
like ‘für’ or ‘ausführlich’ are marked as UNK.
3. Unicode input is not recognised when I use a binary I compiled myself on 
Ubuntu 12.04.5 (self-built Boost 1.57).
4. Everything works fine when I use the binary from 
http://www.statmt.org/moses/RELEASE-3.0/binaries/linux-64bit/.

I tested the above with the queryPhraseTableMin tool (rather than the decoder) 
and got the same results, which is what makes me think this could be somehow 
related to binary incompatibility with the way the phrase table is compacted. 
Haven’t investigated deeper than that, though.
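
One input-side check that could go with the tests above is to confirm that the 
test words reach every binary as the same bytes. The snippet below is only a 
minimal sketch for that purpose: the helper name describe() is made up for 
illustration, and it assumes Python 3.8 or later. It prints the UTF-8 bytes of 
a word and whether it is in composed (NFC) or decomposed (NFD) form. It says 
nothing about how the compact phrase table itself stores or hashes phrases.

```python
import unicodedata


def describe(word: str) -> dict:
    """Return the UTF-8 bytes of a word and its Unicode normalisation status.

    Hypothetical helper for illustration only; it is not part of Moses.
    """
    return {
        "word": word,
        # Space-separated hex dump of the UTF-8 encoding (Python 3.8+).
        "utf8_hex": word.encode("utf-8").hex(" "),
        # True if the word is already in composed / decomposed form.
        "is_nfc": unicodedata.is_normalized("NFC", word),
        "is_nfd": unicodedata.is_normalized("NFD", word),
    }


if __name__ == "__main__":
    # Test words from the report, written with escapes so the composed
    # form (U+00FC) is unambiguous, plus a decomposed variant of the first.
    for w in ["f\u00fcr", "ausf\u00fchrlich", "fu\u0308r"]:
        print(describe(w))
```

If the byte dumps are identical on all machines, the input text can be ruled 
out as the source of the difference between the binaries.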


Any clues?
One could say: just use the pre-compiled Linux binary on Linux. However, I 
have a number of CentOS/RHEL 5 and 6 boxes where the pre-compiled binary 
doesn’t work, as the system glibc is too old. So there I need to compile Moses 
myself, but then Unicode isn’t recognised...



Cheers,

Ventzi

–––––––
Dr. Ventsislav Zhechev
Computational Linguist, Certified ScrumMaster®
Platform Architecture and Technologies
Localisation Services

MAIN +41 32 723 91 22
FAX +41 32 723 93 99

http://VentsislavZhechev.eu <http://ventsislavzhechev.eu/>

Autodesk, Inc.
Rue de Puits-Godet 6
2000 Neuchâtel, Switzerland
www.autodesk.com <http://www.autodesk.com/>



_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
