Re: modernmt

2017-07-02 Thread John Hewitt
I found the reference for that 1,000,000 number a bit too late -- according to this more recent paper from Koehn, it's more like 15,000,000 tokens for NMT to meet phrase-based MT, and they omit syntax-based. https://arxiv.org/pdf/1706.03872.pdf -John On Sun, Jul 2, 2017 at 12:38 PM, John Hewitt

Re: modernmt

2017-07-02 Thread John Hewitt
ink we can identify with the NMT danger. I still think there > > is a > > > > big niche that deep learning approaches won't reach for a few years, > > > until > > > > GPUs become super prevalent. Which is why I like ModernMT's > approaches, &

Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release

2017-06-22 Thread John Hewitt
Related note: I've begun to announce to the Penn NLP communities; I can talk to Mark Liberman at the LDC about getting a note in there as well. -John On Thu, Jun 22, 2017 at 10:11 AM, lewis john mcgibbney wrote: > Hi Tommaso, > EXCELLENT :) > @Matt are you able to Tweet

Re: modernmt

2016-12-01 Thread John Hewitt
I had a few good conversations over dinner with this team at AMTA in Austin in October. They seem to be in the interesting position where their work is good, but is in danger of being superseded by neural MT as they come out of the gate. Clearly, it has benefits over NMT, and is easier to adopt,

Re: Any symal experts?

2016-11-23 Thread John Hewitt
, 2016 at 12:18 PM, Matt Post <p...@cs.jhu.edu> wrote: > John — I suggest trying to ditch those GIZA++ tools entirely. fast_align > indeed replaced them with "atools"; how much work would it be to port that? > > > > On Nov 23, 2016, at 12:11 PM, John Hewit

Any symal experts?

2016-11-23 Thread John Hewitt
Hey everyone, I'm packaging up a Java port of Fast Align for Joshua and integrating it into the pipeline. Fast Align does not produce symmetrical alignments -- it relies on a tool that I haven't ported to Java. We package symal (which symmetricizes alignments) with Joshua right now for GIZA++, so

Re: Updating Incubator summary

2016-11-17 Thread John Hewitt
@Matt, that sounds like an interesting goal. What's the hook? @Henri, that sounds good. I like the idea of showing people snippets, as MT isn't necessarily intuitive to the average Linux.com reader. On Thu, Nov 17, 2016 at 5:44 AM, Matt Post wrote: > My thinking on that

Re: [VOTE] Release Apache Joshua (Incubating) 6.1

2016-11-14 Thread John Hewitt
+1 Let's do it. -John On Mon, Nov 14, 2016 at 1:13 PM, kellen sunderland < kellen.sunderl...@gmail.com> wrote: > +1 . Thanks to Lewis and Matt for all the recent work. > > On Nov 14, 2016 7:11 PM, "Matt Post" wrote: > > +1 > > Thanks for starting this off, Lewis! > > > > On

Re: Pipeline Mystery

2016-10-26 Thread John Hewitt
It seems like MERT isn't writing its final config file (which is typical of MERT, in my experience). I recall giving up and using kbmira. This final config file is the one used in test, so I can see why skipping to test ends up failing pretty quick. To answer your question, though, I haven't

Re: openjdk 8 incompatibility

2016-10-25 Thread John Hewitt
and pushed, try again. > > > > On Oct 25, 2016, at 1:11 PM, John Hewitt <john...@seas.upenn.edu> wrote: > > > > Hi all, > > > > Has anyone been able to compile Joshua with openjdk? I get this message: > > > > /home/john/java/incubator-josh

openjdk 8 incompatibility

2016-10-25 Thread John Hewitt
Hi all, Has anyone been able to compile Joshua with openjdk? I get this message: /home/john/java/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/KenLM.java:[21,19] error: package javafx.scene does not exist And the following link seems to confirm that javafx is not a part of

[jira] [Commented] (JOSHUA-288) Port fast_align to java

2016-10-25 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605667#comment-15605667 ] John Hewitt commented on JOSHUA-288: Replaced gnu-getopt (not Apache licence-compliant) with commons

[jira] [Updated] (JOSHUA-288) Port fast_align to java

2016-10-25 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewitt updated JOSHUA-288: --- Description: It would be great to have a Java port of fast_align, so that we don't have to worry

Re: language pack #1

2016-10-05 Thread John Hewitt
something we could do in the name of making it even clearer. (Potentially checking whether $JOSHUA is the same as $PWD after the directory change in prepare.sh, and printing a warning if it's not?) -John On Wed, Oct 5, 2016 at 11:32 PM, John Hewitt <john...@seas.upenn.edu> wrote: > Thanks, Matt!

[jira] [Commented] (JOSHUA-288) Port fast_align to java

2016-10-02 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541397#comment-15541397 ] John Hewitt commented on JOSHUA-288: I'm moving to benchmark the port against the original C

[jira] [Commented] (JOSHUA-288) Port fast_align to java

2016-09-08 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474276#comment-15474276 ] John Hewitt commented on JOSHUA-288: I've found what is possibly a bug in the original C code which

[jira] [Commented] (JOSHUA-221) ArrayIndexOutOfBoundsException when passing arguments to JoshuaDecoder.main

2016-08-19 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428097#comment-15428097 ] John Hewitt commented on JOSHUA-221: The current command line parsing scheme writes the options

[jira] [Commented] (JOSHUA-288) Port fast_align to java

2016-08-13 Thread John Hewitt (JIRA)
[ https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420046#comment-15420046 ] John Hewitt commented on JOSHUA-288: Existing direct port of fast_align to Java found: https

[GitHub] incubator-joshua issue #32: JOSHUA-286 - Replace old joshua-decoder.org link...

2016-07-28 Thread john-hewitt
Github user john-hewitt commented on the issue: https://github.com/apache/incubator-joshua/pull/32 @lewismc Improvements addressed. Happy to help.

[GitHub] incubator-joshua pull request #32: JOSHUA-286 - Replace old joshua-decoder.o...

2016-07-27 Thread john-hewitt
GitHub user john-hewitt opened a pull request: https://github.com/apache/incubator-joshua/pull/32 JOSHUA-286 - Replace old joshua-decoder.org links with joshua.apache.org - Update links to documentation and support to reflect the move to Apache. - keep Gitignore entry