Hi everyone. Sorry I joined the conversation a bit late.
Here are all the options for Moses on Windows that I've tried:
*Virtual machine* - VMs are more efficient than they used to be. Any
good hypervisor will work. You need to build your own tools to create a
sockets layer to communicate from the host machine to the VM.
*Moses with Cygwin* - I never got it to work. There are too many
variables and the underlying open source components change drastically.
It could be one person's full-time job to keep Moses updated for
those build tools.
*Moses with Visual Studio* - It's a good option if you're looking for
only the moses.exe runtime binary. This does not address the serious
challenges of updating all the other C++, Perl and Python components to
run on Windows. Nor does it address collecting and installing all the
GNU utilities.
*Windows Subsystem for Linux (WSL)* - WSL is native Linux running on
Windows. Microsoft has done a great job of mapping the Linux kernel
calls to the Windows kernel. There's no virtual machine. It's very much
like running Windows applications on Linux using WINE... but without the
complexities of re-mapping Microsoft's GUI APIs.
From Moses' perspective, WSL is nothing special. It sees the
environment as any native Linux. You can compile Linux binaries on Linux
and copy them to your Windows 10 machine and run them with WSL. You will
need to verify/install all of the other dependencies because WSL
distributions are sparse at best. For example, the Ubuntu 16.04 distro
for WSL does not install Python 2.7. The `python` command will fail. It
installs Python 3.5 by default with `python3` as the executable name.
This will cause Moses to fail. You'll need to do a lot of troubleshooting.
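Before troubleshooting individual scripts, it can help to check which interpreters are actually on the PATH. The following is a minimal sketch that runs under the `python3` the distro does install. The list of command names is an illustrative assumption, not a complete inventory of what the Moses scripts call.

```python
import shutil

# Illustrative list only (an assumption); extend it with whatever
# commands your Moses scripts actually invoke.
REQUIRED_COMMANDS = ["python", "python3", "perl"]


def find_missing(commands, which=shutil.which):
    """Return the commands that cannot be found on the PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_COMMANDS)
    if missing:
        print("Not on PATH:", ", ".join(missing))
    else:
        print("All listed commands were found.")
```

On a distro like the one described above, this would be expected to report `python` as missing while `python3` is found.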
WSL also falls short when it comes to inter-process communications
between WSL's native Linux apps and native Windows apps. I found one
Microsoft document stating that WSL and Windows apps share the network
sockets stack, and that is the only inter-process communication between
the two layers. So, STDIN/STDOUT/STDERR are not available between native
Windows apps and Linux apps on WSL. This means from a practical
standpoint, you're back to working with WSL as though it's a VM. That
is, you can build a sockets layer to communicate between the two. Maybe
you can use the moses binary in Moses Server mode and use XML-RPC, but I
never tried it.
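For anyone who wants to experiment with the untried XML-RPC route, a client on the Windows side could look roughly like the sketch below. It assumes a Moses server is already running inside WSL and listening on localhost port 8080 (both are assumptions for illustration), and that the server exposes a `translate` method that takes and returns a struct with a `text` field, as the sample clients shipped with Moses do. It has not been tested against a WSL setup.

```python
import xmlrpc.client

# Assumption: a Moses server is listening on this host and port.
DEFAULT_URL = "http://localhost:8080/RPC2"


def make_proxy(url=DEFAULT_URL):
    """Create the XML-RPC proxy; no connection is made until the first call."""
    return xmlrpc.client.ServerProxy(url)


def translate(text, proxy):
    """Send one sentence to the server's `translate` method and return the
    translated string. `proxy` is any object exposing translate(dict)."""
    response = proxy.translate({"text": text})
    return response["text"]
```

A Windows application would then call something like `translate("das ist ein haus", make_proxy())`, passing text that is already tokenized the way the model expects.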
There's another awkward inter-process strategy. If you're in a Windows
cmd.exe terminal or PowerShell terminal, you can execute a command line
that will launch a Linux app in the WSL space. (The reverse also works:
from a native Linux Bash terminal, you can launch a native Windows .exe,
but that's irrelevant to Moses.) So, you could have your Windows app save a
text file to the file system, then execute the Moses Linux binary with
the -i argument pointing to the input file. The challenge here is
translating the Windows file system "C:\Users\...." to the WSL file
system "/mnt/c/Users/...". You'll also have to figure out a way to pass
the STDOUT redirection to the moses binary to save its output. I don't
know if this is possible. Then, your Windows application that spawned
the Linux Moses binary will need to detect when the binary is done, and
then retrieve the output file. WARNING: do not directly access the Linux
file system tree from Windows
[%LOCALAPPDATA%\Packages\CanonicalGroupLimited.UbuntuonWindows_xxxxxxxxxxxxx\LocalState\rootfs].
You are virtually guaranteed to corrupt the files. Have Windows apps
save only to their native C:\ drives, then access those files from Linux
through the /mnt directory tree.
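The file-based strategy above can be sketched from the Windows side as follows. This is a hedged illustration, not a tested recipe: it assumes `bash.exe` is the WSL launcher on the PATH, that the decoder accepts `-f` for its config file alongside the `-i` input argument, and that the quoting survives the hand-off from Windows to Bash. The function names are hypothetical. The redirection is written into the Bash command string so that Bash, not Windows, handles it.

```python
import shlex
import subprocess
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path):
    """Map C:\\Users\\me\\in.txt to /mnt/c/Users/me/in.txt."""
    path = PureWindowsPath(windows_path)
    drive = path.drive  # e.g. "C:"
    if len(drive) != 2 or drive[1] != ":" or not path.root:
        raise ValueError("expected an absolute drive path: %r" % windows_path)
    parts = path.parts[1:]  # drop the "C:\\" anchor
    return "/".join(["/mnt", drive[0].lower()] + list(parts))


def build_moses_command(moses_bin, ini_path, input_path, output_path):
    """Build the Bash command line, including the STDOUT redirection.

    moses_bin is a Linux path inside WSL; the other three are Windows paths.
    """
    return "{} -f {} -i {} > {}".format(
        shlex.quote(moses_bin),
        shlex.quote(windows_to_wsl_path(ini_path)),
        shlex.quote(windows_to_wsl_path(input_path)),
        shlex.quote(windows_to_wsl_path(output_path)),
    )


def run_moses(moses_bin, ini_path, input_path, output_path):
    """Run the command via bash.exe; subprocess.run blocks until it exits,
    which is how the caller knows the output file is ready to read."""
    command = build_moses_command(moses_bin, ini_path, input_path, output_path)
    completed = subprocess.run(["bash.exe", "-c", command])
    return completed.returncode
```

All three files live on the Windows drive, so nothing here touches the WSL rootfs directly.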
Bottom line... you should read what Microsoft's engineers have written
about why they created WSL. It is not intended to be a production tool.
It is designed as a development environment that facilitates developing
cross-platform applications/tools. Microsoft is still putting great
development effort into WSL, so it will likely evolve to become a much
better production environment. WSL also has some funky quirks relating
to path lengths and permissions, but they were edge cases for me and
you'll probably never encounter them.
*Slate Toolkit* - I have free Windows and Linux package distributions
that you can download from my website
(https://slate.rocks/slate-toolkit-edition/). I won't repeat the README
here. In brief, the toolkit packages include all of the MGIZA++
tools and the Moses tools for each platform as described in the README.
The Windows package includes the required native Windows GNU tools.
These are the Git commits used to compile the binaries from the Moses
GitHub trunk:
MGIZA_COMMIT = d643960de98565d208114780ba8025799208afa7
MOSES_COMMIT_LINUX = 3a0631a05b7f53a7f387ca8ddca432f5ddb22029 (Jan 2018?)
MOSES_COMMIT_WINDOWS = a64468a9919867174beba71962ada795a4ce12d3 (pre-merge of Moses server)
The Linux binaries were compiled using Moses' BJAM. The Windows binaries
were compiled using my proprietary build environment (Visual Studio or
Cygwin). Both Linux and Windows were compiled with these options enabled:
* 64-bit architecture only
* static compiled -- no .dll/.so dependencies
* KenLM
* XML-RPC
* legacy binarized translation and reordering tables
The packages include Perl and Python scripts from MGIZA++ and Moses that
I updated for cross-platform use. This means these packages are
full-stack phrase-based SMT that create SMT models and decode with them.
Models trained/tuned on Windows can be copied to Linux for decoding and
vice versa.
The files in the packages are laid out so that you unzip the package,
open a command line and execute the demo shell scripts. There is a
sample data set so you can confirm everything works. There are demo
shell scripts (Bash for Linux, .CMD batch files for Windows) that
execute complete toolchains to build a baseline SMT model. They are not
suitable for production, but they should be a good learning environment
to compare Linux vs Windows usage. Standard pipes, sockets, file
system... they're all available for inter-process communications between
the Moses binaries and other Windows apps.
Licensing: All of the components, including the binaries, are
distributed under their respective open source licenses. For the most
part, that's LGPL-3 for Moses. The package has a complete list of open
source licenses per component. This means you can freely re-distribute
the binaries and scripts within the terms of the respective open source
licenses. I.e., I don't make any money on them.
*Slate Desktop* - Lastly, I'll mention our flagship application. For
the SMT component, Slate Desktop uses Slate Toolkit. It includes many
more tools for corpus preparation and integration with popular CAT
tools. We distribute Slate Desktop under a proprietary EULA. License
fees are charged per host. There are no subscription fees or royalties.
Slate Desktop's underlying architecture includes a primitive API. You
could, for example, license our API and build your application around
it. This means you could have your application running and tested in a
few days. If this interests you, please contact me off-list.
I hope this helps everyone and puts the Windows support in perspective.
Regards,
Tom
On 5/9/2018 11:58 PM, [email protected] wrote:
Date: Wed, 9 May 2018 18:52:48 +0200
From: Mohamed Amine MENACER<[email protected]>
Subject: Re: [Moses-support] Install moses on Windows
To: Nisheeth Joshi<[email protected]>,[email protected]
Cc:[email protected]
Hi Nisheeth,
Great!! I will try it out right away.
I also think that it is a good idea to put this solution on the official
website.
Warm regards.
Amine.