Hi Matt,

On Mon, Mar 14, 2016 at 8:26 AM, Matt Post <[email protected]> wrote:

> Whoa! Lewis, can you give some more detail on this talk, what you
> proposed, and what you plan to talk about?
>

http://sched.co/6OJI


>
> I haven't ever been to ApacheCon, but am interested in going. I don't have
> much of a feel for what motivates folks outside the academic research
> community, and that would be good to have in laying out projects that might
> interest people.
>

I agree. Would be great to meet you there. We could have a Joshua meetup.


>
> Regarding those projects, I have a number of them. Perhaps it would be
> useful to flesh them out with some more detail, and perhaps post them, for
> those who are interested. First, with respect to Tommaso's question, the
> following:
>
> - Use cases. I'd really like to push machine translation as a black box,
> where people can download and use models, not caring how they work, and
> building on top of them. I think this could be transformative. I've just
> added to Joshua the ability to add, store, and manage custom phrasal
> translation rules, which would let people take a model and add their own
> translations on top of it, perhaps correcting mistakes as they encounter
> them. There's a JSON API for it (undocumented).
>
> Building this up would also require pulling together lots of different
> test sets, evaluating changes, and so on.
>
> - Neural nets. This is a huge research area. I think the main advantage is
> that it could enable releasing models that are much smaller. On the
> downside, it's not clear what the best way to integrate these models
> into Joshua is. Fully neural attention models would require re-architecting
> Joshua, as they are essentially a new paradigm. Adding neural components as
> feature functions that interact with the existing decoding algorithm would
> be an intermediate step.
>

OK. This sounds bang on for a meetup topic. Regardless of who is
there, we could have a Webex or something similar for the incubating
community.


>
> For other projects, I'd love:
>
> - Better documentation, developer and end-user (probably I need to write a
> lot of this; if nothing else, it would be hugely useful to me in terms of
> prioritizing to know that people want it)


> - Rewriting certain components. The tuning modules, in particular, are a
> real mess, and should be synthesized and improved.
>
> - Replacing Moses components. Joshua can call out to Moses to build phrase
> tables; it would be nice (and wouldn't be that hard) to replace this
> with our own Java implementations. It would also be good to add a
> lexicalized distortion model to the phrase-based decoder.
>
>
These all sound excellent and would all make very reasonable GSoC projects.
Thanks
Lewis
