Re: ApacheCon Meetup

kellen sunderland Thu, 12 May 2016 18:08:46 -0700

I just wanted to discuss it as a group.  Your approach looks good to me.

On Thu, May 12, 2016 at 6:05 PM, Henry Saputra <[email protected]>
wrote:


> Ah sorry, trigger happy
>
> About logging. Are you proposing to use the log4j interface in the code? I
> would recommend using slf4j [1] as a facade abstraction.
> The implementation could then be done via log4j or logback.
>
> Love to see API access to Joshua.
>
> - Henry
>
> [1] http://www.slf4j.org
>
> On Thu, May 12, 2016 at 6:03 PM, Henry Saputra <[email protected]>
> wrote:
>
> > About logging. Are you proposing to use log4j interface in the code? I
> > would recommend to use slf4j [1]
> >
> >
> > [
> >
> > On Thu, May 12, 2016 at 2:30 PM, kellen sunderland <
> > [email protected]> wrote:
> >
> >> Thanks for organizing Lewis,
> >>
> >> Here are some topics for discussion I've been noting while working
> >> with Joshua.  None of these are high-priority issues for me, but if we
> >> are all in agreement on them it might make sense to log them.
> >>
> >> Boring code convention stuff: logging with log4j, throwing
> >> RuntimeExceptions instead of typed exceptions, removing all system
> >> exits (replacing them with RuntimeExceptions), and refactoring some
> >> large files.
> >>
> >> Testing: Integrate existing unit tests, provide some good test examples
> >> so others can begin adding more tests.
> >>
> >> Configuration: We also touched on IoC, CLI args, and configuration
> >> changes that are possible.
> >>
> >> OO stuff: Joshua is pretty good here, but I would personally prefer more
> >> granular interfaces.  I wouldn't advocate radical changes, but maybe a
> >> little refactoring might make sense to better align with the interface
> >> segregation principle.
> >> https://en.wikipedia.org/wiki/Interface_segregation_principle
> >>
> >> JNI reliance:  We've found KenLM works really well with Joshua, but
> >> there is one issue with using it.  It requires many JNI calls during
> >> decoding, and these calls impact GC performance.  In fact, when a JNI
> >> call happens the GC throws out any work it may have done and quits
> >> until the JNI call completes.  The GC will then resume and start
> >> marking objects for collection from scratch.  This is not ideal,
> >> especially for programs with large heaps (Joshua / Spark).  There are a
> >> couple of ways we could mitigate this, and I think they'd all speed up
> >> Joshua quite a lot.
> >>
> >> High level roadmap topics:
> >>
> >> *  Distributed Decoding is something I'll likely continue working on.
> >> There are some obvious things we can do given usage patterns of
> >> translation engines that can help us out here (I think).
> >> *  Providing a way to optimize Joshua for low-latency, low-throughput
> >> calls could be interesting for those with near real-time use cases.
> >> Providing a way to optimize for high-latency, high-throughput could be
> >> interesting for async/batch use cases.
> >> *  The machine learning optimization algorithms could be cleaned up a
> >> bit (MERT/MIRA).
> >> *  The Vocabulary could probably be replaced with a simpler
> >> implementation (without sacrificing performance).
> >>
> >> -Kellen
> >>
> >>
> >>
> >> On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney <
> >> [email protected]> wrote:
> >>
> >> > Hi Folks,
> >> > Kellen, Henry and I are going to get together tomorrow (the 13th)
> >> > around lunchtime PST to talk everything Joshua.
> >> > Would be great to have others online via GChat if possible.
> >> > Let's say around 11am PST for the time being.
> >> > See you then folks.
> >> > Thanks
> >> > Lewis
> >> >
> >> >
> >> > --
> >> > *Lewis*
> >> >
> >>
> >
> >
>

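On the "remove all system exits (replace with RuntimeExceptions)" item in
the quoted list, here is a minimal Java sketch of what that change could
look like. The class and method names (ExitVersusException,
parseThreadCount) are hypothetical and are not taken from the Joshua code
base. The only point is that library code throws an unchecked exception and
leaves the decision to terminate to the caller.

```java
// Hypothetical example; none of these names exist in Joshua.
class ExitVersusException {

    // Parses a thread-count setting. Where older code might have printed an
    // error and called System.exit(1), this version throws an unchecked
    // exception so that callers (a CLI, a server, a unit test) can decide
    // how to react.
    static int parseThreadCount(String value) {
        final int n;
        try {
            n = Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            // Keep the original cause so the stack trace stays useful.
            throw new IllegalArgumentException(
                "thread count is not a number: " + value, e);
        }
        if (n < 1) {
            throw new IllegalArgumentException(
                "thread count must be at least 1, got " + n);
        }
        return n;
    }

    public static void main(String[] args) {
        String raw = args.length > 0 ? args[0] : "4";
        try {
            System.out.println("threads = " + parseThreadCount(raw));
        } catch (IllegalArgumentException e) {
            // Only the outermost entry point chooses the exit code.
            System.err.println("bad configuration: " + e.getMessage());
            System.exit(2);
        }
    }
}
```

With this shape a unit test can assert on the exception directly, which is
not possible when the code under test calls System.exit itself.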