Folks,

I've updated the code with a few changes that will support Dockerized language 
packs. The nice thing is that this makes it easy to include KenLM.

Here are some changes that were made:

- Joshua now records the directory in which the config file was found and 
automatically resolves relative paths in the config file against that 
directory. This means you don't have to "cd" to the LP (language pack) 
directory before running Joshua.
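
The relative-path behavior described above can be sketched roughly as follows. 
This is an illustrative Python sketch, not Joshua's actual (Java) code; the 
helper name `resolve_config_path` and the example paths are hypothetical, and 
leaving absolute paths untouched is an assumption.

```python
from pathlib import PurePosixPath


def resolve_config_path(config_file: str, path_in_config: str) -> PurePosixPath:
    """Resolve a path read from a config file against the config file's directory.

    Hypothetical helper for illustration only; not part of Joshua.
    """
    path = PurePosixPath(path_in_config)
    if path.is_absolute():
        # Assumption: absolute paths are used exactly as written.
        return path
    # Relative paths are anchored at the directory that contains the config
    # file, not at the current working directory.
    return PurePosixPath(config_file).parent / path


if __name__ == "__main__":
    # Hypothetical language pack location and model file name.
    print(resolve_config_path("/opt/lp/joshua.config", "model/lm.kenlm"))
```

With this kind of resolution, the decoder can be started from any working 
directory as long as it is given the path to the config file.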

- I fixed the HTTP server to accept multiple "q=" parameters in one request, 
just like the Google Translate API. Before, it accepted only one "q=" 
parameter. This should mean (I'll test later today) that the HTTP server can 
handle throughput at essentially the rate of the TCP server.
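
As a rough illustration of a request carrying several "q=" parameters, here is 
a minimal Python sketch that only builds the URL. The host, port, and path in 
the example are assumptions (5674 is the port mentioned in the quoted thread 
below), and `build_translate_url` is a hypothetical helper, not part of Joshua.

```python
from typing import List
from urllib.parse import urlencode


def build_translate_url(base_url: str, sentences: List[str]) -> str:
    """Build a URL with one repeated q= parameter per input sentence."""
    # A list of ("q", sentence) pairs makes urlencode emit q= once per
    # sentence, in order, with each value percent-encoded.
    query = urlencode([("q", sentence) for sentence in sentences])
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Assumed server address; adjust to wherever the HTTP server is listening.
    print(build_translate_url("http://localhost:5674/", ["el gato", "hola mundo"]))
```

Batching sentences this way avoids one HTTP round trip per sentence, which is 
presumably where the throughput gain would come from.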

- I added (but haven't pushed yet) the KenLM model files to the language packs. 
In addition, I added a file "joshua.config.kenlm". Both are used only by the 
Docker setup.

- I fixed the Docker setup. See the new file:

        https://github.com/apache/incubator-joshua/blob/master/distribution/docker/kenlm/Dockerfile

The Docker image builds KenLM. The container expects to be run with an 
existing language pack mounted at /model. It then loads the 
joshua.config.kenlm file and runs Joshua as a server in HTTP mode. See the 
README file for details:

        https://github.com/apache/incubator-joshua/tree/master/distribution/docker/kenlm
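
For orientation, the mount-and-run step might look something like the command 
assembled by this Python sketch. The image tag, the local language pack path, 
and the published port are all assumptions for illustration; only the /model 
mount point comes from the description above, and the README is the 
authoritative source for the actual invocation.

```python
import shlex
from typing import List


def docker_run_command(language_pack_dir: str, image: str, port: int) -> List[str]:
    """Assemble a `docker run` argv that mounts a language pack at /model."""
    return [
        "docker", "run",
        # Bind-mount the unpacked language pack where the container expects it.
        "-v", f"{language_pack_dir}:/model",
        # Assumption: the HTTP server listens on this port inside the container.
        "-p", f"{port}:{port}",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical path and image tag; substitute your own.
    cmd = docker_run_command("/path/to/apache-joshua-es-en-2017-03-03", "joshua-kenlm", 5674)
    print(shlex.join(cmd))
```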

If anyone wants to test this out, please do. You can grab an updated language 
pack (version 3) here:

        http://cs.jhu.edu/~post/language-packs/apache-joshua-es-en-2017-03-03.tgz

(Warning: 9 GB)

matt


> On Nov 23, 2016, at 10:14 AM, kellen sunderland <kellen.sunderl...@gmail.com> 
> wrote:
> 
> Yeah it should just be 'docker pull kellens/apache-joshua-es-en-2016-10-05'
> then 'docker run -it kellens/apache-joshua-es-en-2016-10-05 /bin/bash' or
> something similar.  I think the default command should eventually be to run
> the http server, so ideally we'd just do 'docker run -p 5674
> kellens/apache-joshua-es-en-2016-10-05' and that would start up the http
> server on port 5674.
> 
> Good point on Perl + Python, I can add them.
> 
> -Kellen
> 
> On Wed, Nov 23, 2016 at 3:22 PM, Matt Post <p...@cs.jhu.edu> wrote:
> 
>> Okay, I have this with
>> 
>>        docker run -it kellens/apache-joshua-es-en-2016-10-05 bash
>> 
>> It seems we are missing Perl (./prepare.sh fails), and we should replace
>> the LanguageModel line with a KenLM instance and build that. I bet we'll
>> need Python, too.
>> 
>> 
>> 
>> 
>>> On Nov 23, 2016, at 8:15 AM, Matt Post <p...@cs.jhu.edu> wrote:
>>> 
>>> Kellen, can I bother you to post a few first steps? I've successfully
>> pulled this down to my mac but now do not know how to find it, edit it, or
>> run it. I'm poring through the documentation and will find it eventually
>> but this would save me a bit of time.
>>> 
>>> 
>>>> On Nov 23, 2016, at 8:07 AM, kellen sunderland <
>> kellen.sunderl...@gmail.com> wrote:
>>>> 
>>>> Yes my next step was going to be getting it hosted officially.
>>>> 
>>>> I'll go ahead and open a ticket.  I think I'll hold off on pushing to
>> the
>>>> Apache account until I've done a little more testing though.
>>>> 
>>>> On Nov 23, 2016 5:22 AM, "lewis john mcgibbney" <lewi...@apache.org>
>> wrote:
>>>> 
>>>>> Hi Kellen,
>>>>> Nice :)
>>>>> Another option is for us to host these via the Apache account.
>>>>> https://hub.docker.com/r/apache/
>>>>> We could then add a badge to our README which points to the
>> Dockerfile(s).
>>>>> Do you want to open a ticket over on the INFRA Jira for this?
>>>>> 
>>>>> On Tue, Nov 22, 2016 at 1:57 PM, <
>>>>> dev-digest-h...@joshua.incubator.apache.org> wrote:
>>>>> 
>>>>>> From: kellen sunderland <kellen.sunderl...@gmail.com>
>>>>>> To: "dev@joshua.incubator.apache.org" <dev@joshua.incubator.apache.
>> org>
>>>>>> Cc:
>>>>>> Date: Tue, 22 Nov 2016 22:56:56 +0100
>>>>>> Subject: Re: Dockerhub hosted images
>>>>>> Ok, the first image should be properly uploaded now.
>>>>>> 
>>>>>> https://hub.docker.com/r/kellens/apache-joshua-es-en-2016-10-05/
>>>>>> 
>>>>>> -Kellen
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>> 
>> 
