I found the reference for that 1,000,000 number a bit too late -- according
to this more recent paper from Koehn, it's more like 15,000,000 tokens for
NMT to meet phrase-based MT, and they omit syntax-based.
https://arxiv.org/pdf/1706.03872.pdf
-John
On Sun, Jul 2, 2017 at 12:38 PM, John Hewitt
ink we can identify with the NMT danger. I still think there
> > is a
> > > > big niche that deep learning approaches won't reach for a few years,
> > > until
> > > > GPUs become super prevalent. Which is why I like ModernMT's
> approaches,
Related note: I've begun to announce to the Penn NLP communities; I can
talk to Mark Liberman at the LDC about getting a note in there as well.
-John
On Thu, Jun 22, 2017 at 10:11 AM, lewis john mcgibbney
wrote:
> Hi Tommaso,
> EXCELLENT :)
> @Matt are you able to Tweet
I had a few good conversations over dinner with this team at AMTA in Austin
in October.
They seem to be in the interesting position where their work is good, but
is in danger of being superseded by neural MT as they come out of the gate.
Clearly, it has benefits over NMT, and is easier to adopt,
, 2016 at 12:18 PM, Matt Post <p...@cs.jhu.edu> wrote:
> John — I suggest trying to ditch those GIZA++ tools entirely. fast_align
> indeed replaced them with "atools"; how much work would it be to port that?
>
>
> > On Nov 23, 2016, at 12:11 PM, John Hewit
Hey everyone,
I'm packaging up a Java port of Fast Align for Joshua and integrating it into
the pipeline.
Fast Align does not produce symmetrical alignments -- it relies on a tool
that I haven't ported to Java.
We package symal (which symmetrizes alignments) with Joshua right now for
GIZA++, so
@Matt, that sounds like an interesting goal. What's the hook?
@Henri, that sounds good. I like the idea of showing people snippets, as MT
isn't necessarily intuitive to the average Linux.com reader.
On Thu, Nov 17, 2016 at 5:44 AM, Matt Post wrote:
> My thinking on that
+1 Let's do it.
-John
On Mon, Nov 14, 2016 at 1:13 PM, kellen sunderland <
kellen.sunderl...@gmail.com> wrote:
> +1 . Thanks to Lewis and Matt for all the recent work.
>
> On Nov 14, 2016 7:11 PM, "Matt Post" wrote:
>
> +1
>
> Thanks for starting this off, Lewis!
>
>
> > On
It seems like MERT isn't writing its final config file (which is typical
of MERT, in my experience). I recall giving up and using kbmira. This final
config file is the one used in test, so I can see why skipping to test ends
up failing pretty quickly.
To answer your question, though, I haven't
and pushed, try again.
>
>
> > On Oct 25, 2016, at 1:11 PM, John Hewitt <john...@seas.upenn.edu> wrote:
> >
> > Hi all,
> >
> > Has anyone been able to compile Joshua with openjdk? I get this message:
> >
> > /home/john/java/incubator-josh
Hi all,
Has anyone been able to compile Joshua with openjdk? I get this message:
/home/john/java/incubator-joshua/src/main/java/org/apache/joshua/decoder/ff/lm/KenLM.java:[21,19]
error: package javafx.scene does not exist
And the following link seems to confirm that javafx is not a part of
[
https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15605667#comment-15605667
]
John Hewitt commented on JOSHUA-288:
Replaced gnu-getopt (not Apache licence-compliant) with commons
[
https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Hewitt updated JOSHUA-288:
---
Description:
It would be great to have a Java port of fast_align, so that we don't have to
worry
something we could do in the name of making it even
clearer. (Potentially checking whether $JOSHUA is the same as $PWD after
the directory change in prepare.sh, and printing a warning if it's not?)
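A rough sketch of what that check might look like, assuming prepare.sh is a
POSIX shell script. The function name check_joshua_dir is hypothetical, not
something that exists in Joshua today:

```shell
# Hypothetical helper: warn when $JOSHUA does not point at the current directory.
# Both paths are resolved with `pwd -P` so that symlinks and trailing slashes
# do not trigger a false warning.
check_joshua_dir() {
    if [ -z "${JOSHUA:-}" ]; then
        echo "WARNING: \$JOSHUA is not set" >&2
        return 1
    fi
    joshua_real=$(cd "$JOSHUA" 2>/dev/null && pwd -P) || {
        echo "WARNING: \$JOSHUA ($JOSHUA) is not a directory" >&2
        return 1
    }
    pwd_real=$(pwd -P)
    if [ "$joshua_real" != "$pwd_real" ]; then
        echo "WARNING: \$JOSHUA ($joshua_real) differs from the current directory ($pwd_real)" >&2
        return 1
    fi
    return 0
}
```

Calling check_joshua_dir right after the directory change would print the
warning to stderr; whether the script should then continue or stop is a
separate choice.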
-John
On Wed, Oct 5, 2016 at 11:32 PM, John Hewitt <john...@seas.upenn.edu> wrote:
> Thanks, Matt!
[
https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15541397#comment-15541397
]
John Hewitt commented on JOSHUA-288:
I'm moving to benchmark the port against the original C
[
https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15474276#comment-15474276
]
John Hewitt commented on JOSHUA-288:
I've found what is possibly a bug in the original C code which
[
https://issues.apache.org/jira/browse/JOSHUA-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15428097#comment-15428097
]
John Hewitt commented on JOSHUA-221:
The current command line parsing scheme writes the options
[
https://issues.apache.org/jira/browse/JOSHUA-288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15420046#comment-15420046
]
John Hewitt commented on JOSHUA-288:
Existing direct port of fast_align to Java found:
https
Github user john-hewitt commented on the issue:
https://github.com/apache/incubator-joshua/pull/32
@lewismc Improvements addressed. Happy to help.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does
GitHub user john-hewitt opened a pull request:
https://github.com/apache/incubator-joshua/pull/32
JOSHUA-286 - Replace old joshua-decoder.org links with joshua.apache.org
- Update links to documentation and support to reflect the
move to Apache.
- keep Gitignore entry