About logging: are you proposing to use the log4j interface in the code? I would recommend using slf4j [1].
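For illustration, a minimal sketch of what coding against the slf4j API looks like (class name `DecoderExample` is hypothetical, not an actual Joshua class; the concrete backend, e.g. logback or slf4j-log4j12, is selected by whichever binding is on the classpath at deployment time):

```java
// Minimal slf4j usage sketch: the code depends only on the slf4j API,
// while the concrete logging backend is chosen at runtime via the
// classpath. DecoderExample is a hypothetical class for illustration.
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class DecoderExample {
    private static final Logger LOG = LoggerFactory.getLogger(DecoderExample.class);

    public static void main(String[] args) {
        String grammar = "grammar.gz";
        // Parameterized messages ({} placeholders) avoid string
        // concatenation cost when the log level is disabled.
        LOG.info("Loading grammar from {}", grammar);
        try {
            throw new IllegalStateException("demo failure");
        } catch (IllegalStateException e) {
            LOG.error("Decoding failed", e);
        }
    }
}
```

Swapping the backend later then requires no source changes, only a different binding jar on the classpath.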
On Thu, May 12, 2016 at 2:30 PM, kellen sunderland <[email protected]> wrote:

> Thanks for organizing Lewis,
>
> Here's some topics for discussion I've been noting while working with
> Joshua. None of these are high-priority issues for me, but if we are all
> in agreement on them it might make sense to log them.
>
> Boring code convention stuff: Logging with log4j, throw runtime exceptions
> instead of typed, remove all system exits (replace with RuntimeExceptions),
> refactor some large files.
>
> Testing: Integrate existing unit tests, provide some good test examples so
> others can begin adding more tests.
>
> Configuration: We also touched on IoC, CLI args, and configuration changes
> that are possible.
>
> OO stuff: Joshua is pretty good here, but I would personally prefer more
> granular interfaces. I wouldn't advocate radical changes, but maybe a
> little refactoring might make sense to better align with the interface
> segregation principle.
> https://en.wikipedia.org/wiki/Interface_segregation_principle
>
> JNI reliance: We've found KenLM works really well with Joshua, but there
> is one issue with using it. It requires many JNI calls during decoding,
> and these calls impact GC performance. In fact, when a JNI call happens
> the GC throws out any work it may have done and quits until the JNI call
> completes. The GC will then resume and start marking objects for
> collection from scratch. This is not ideal, especially for programs with
> large heaps (Joshua / Spark). There are a couple of ways we could mitigate
> this, and I think they'd all speed up Joshua quite a lot.
>
> High-level roadmap topics:
>
> * Distributed decoding is something I'll likely continue working on.
>   There are some obvious things we can do, given usage patterns of
>   translation engines, that can help us out here (I think).
> * Providing a way to optimize Joshua for low-latency, low-throughput
>   calls could be interesting for those with near real-time use cases.
>   Providing a way to optimize for high-latency, high-throughput could be
>   interesting for async/batch use cases.
> * The machine learning optimization algorithms could be cleaned up a bit
>   (MERT/MIRA).
> * The Vocabulary could probably be replaced with a simpler implementation
>   (without sacrificing performance).
>
> -Kellen
>
>
> On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney <[email protected]> wrote:
>
> > Hi Folks,
> > Kellen, Henri and I are going to get together tomorrow 13th around
> > lunchtime PST to talk everything Joshua.
> > Would be great to have others online via GChat if possible.
> > Let's say around 11am PST for the time being.
> > See you then folks.
> > Thanks
> > Lewis
> >
> >
> > --
> > *Lewis*
> >
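As a footnote to the "remove all system exits" point in the quoted list: the usual pattern is for library code to throw an unchecked exception and let only the top-level entry point decide the process's fate. A minimal sketch, assuming a hypothetical `GrammarLoader` class (not an actual Joshua class):

```java
// Hypothetical sketch of the System.exit -> RuntimeException refactor
// discussed above. GrammarLoader and DecoderException are illustrative
// names, not real Joshua classes.
public class GrammarLoader {

    /** Thrown when the decoder cannot continue; replaces System.exit(1). */
    public static class DecoderException extends RuntimeException {
        public DecoderException(String message) {
            super(message);
        }
    }

    public static String load(String path) {
        if (path == null || path.isEmpty()) {
            // Before: System.exit(1) -- untestable, and it kills any
            // embedding JVM (e.g. a server or Spark executor).
            // After: callers can catch, log, and recover.
            throw new DecoderException("No grammar file specified");
        }
        return "loaded:" + path;
    }
}
```

An embedding application (or a unit test) can now catch `DecoderException` instead of losing the whole JVM, which also makes the "integrate existing unit tests" item above easier.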
