Re: [DISCUSS] - Graduation of Apache Joshua

2018-09-20 Thread Henry Saputra
> > > NOW, THEREFORE, BE IT RESOLVED, that a Project Management
> > > Committee (PMC), to be known as the "Apache Joshua Project",
> > > be and hereby is established pursuant to Bylaws of the
> > > Foundation; and be it further
> > >
> > > RESOLVED, that the Apache Joshua Project be and hereby is
> > > responsible for the creation and maintenance of software
> > > related to statistical and other forms of machine translation;
> > > and be it further
> > >
> > > RESOLVED, that the office of "Vice President, Apache Joshua" be
> > > and hereby is created, the person holding such office to
> > > serve at the direction of the Board of Directors as the chair
> > > of the Apache Joshua Project, and to have primary responsibility
> > > for management of the projects within the scope of
> > > responsibility of the Apache Joshua Project; and be it further
> > >
> > > RESOLVED, that the persons listed immediately below be and
> > > hereby are appointed to serve as the initial members of the
> > > Apache Joshua Project:
> > >
> > > * Tom Barber
> > > * Thamme Gowda
> > > * Felix Hieber
> > > * Lewis John McGibbney
> > > * Chris Mattmann
> > > * Matt Post
> > > * Paul Ramirez
> > > * Henry Saputra
> > > * Kellen Sunderland
> > > * Tommaso Teofili
> > >
> > > NOW, THEREFORE, BE IT FURTHER RESOLVED, that Tommaso Teofili
> > > be appointed to the office of Vice President, Apache Joshua to
> > > serve in accordance with and subject to the direction of the
> > > Board of Directors and the Bylaws of the Foundation until
> > > death, resignation, retirement, removal or disqualification,
> > > or until a successor is appointed; and be it further
> > >
> > > RESOLVED, that the initial Apache Joshua PMC be and hereby is
> > > tasked with the creation of a set of bylaws intended to
> > > encourage open development and increased participation in the
> > > Apache Joshua Project; and be it further
> > >
> > > RESOLVED, that the Apache Joshua Project be and hereby
> > > is tasked with the migration and rationalization of the Apache
> > > Incubator Joshua podling; and be it further
> > >
> > > RESOLVED, that all responsibilities pertaining to the Apache
> > > Incubator Joshua podling encumbered upon the Apache Incubator
> > > Project are hereafter discharged.


Re: Thumbs up from general@ to release Joshua 6.1 (Incubating)

2017-06-16 Thread Henry Saputra
Congrats, guys.

Thanks for all the work managing this release, Tommaso!

- Henry

On Mon, Jun 12, 2017 at 10:54 AM, Tommaso Teofili  wrote:

> sorry for the delay guys, I'm on a conference week but I should be able to
> proceed later today.
>
> Regards,
> Tommaso
>
> p.s.:
> here's the slide deck from a presentation a friend of mine and I gave on
> multi-language search (with Apache Joshua):
> https://smarthi.github.io/bbuzz17-embracing-diversity-searching-over-multiple-languages
>
>
>
> Il giorno lun 12 giu 2017 alle ore 17:34 Matt Post  ha
> scritto:
>
> > Thanks for all your work on this, folks!
> >
> > matt
> >
> >
> > > On Jun 10, 2017, at 4:00 PM, lewis john mcgibbney 
> > wrote:
> > >
> > > Hi Folks,
> > > > Both Justin and John have provided us with +1's for releasing... which is
> > > > quite frankly great.
> > > > We've been undertaking a good bit of due diligence for this release... it
> > > > has admittedly taken a hellish amount of time to push through. On the
> > > > bright side, we have now nearly made the first official Apache release
> > > > which is a huge milestone for the project and for getting the word out that
> > > > we are alive and kicking in the Incubator.
> > > > Huge thank you to Tommaso who has been acting as release manager and
> > > > community liaison so to speak. It makes a huge difference and is greatly
> > > > appreciated.
> > > > Once Tommaso's RESULT thread hits general@ we can progress with the
> > > > remaining release management items.
> > > > Hopefully there will be a release announcement pretty soon.
> > > > In the meantime, can everyone begin thinking about appropriate avenues and
> > > > communication forums for us to publicize the release announcement? If you
> > > > could, please append them to the release management document on the Joshua
> > > > wiki.
> > > Best
> > > Lewis
> > >
> > >
> > > --
> > > http://home.apache.org/~lewismc/
> > > @hectorMcSpector
> > > http://www.linkedin.com/in/lmcgibbney
> >
> >
>


Re: ping on RC4 vote

2017-04-23 Thread Henry Saputra
So is RC4 still the release candidate for v6.1, or do we need to wait for
the new checksum files to be uploaded?

On Thu, Apr 20, 2017 at 1:34 PM, lewis john mcgibbney 
wrote:

> PING Tommaso.
>
> On Thu, Apr 13, 2017 at 11:32 AM, lewis john mcgibbney  >
> wrote:
>
> > Hi Tommaso,
> >
> > Go for it. Let's get some more feedback and then we can take it to the
> > IPMC if the VOTE passes here.
> > Lewis
> >
> > On Mon, Apr 10, 2017 at 5:46 AM,  > incubator.apache.org> wrote:
> >
> >>
> >> thanks a lot Lewis for your in depth analysis which makes things clearer
> >> now.
> >> I can find the mentioned (wrong) binary files in the source packages on
> >> dist/dev [1] while I can't find them within the ones on the staging repo
> >> [2].
> >> So if I can copy the ones from the staging repo to dist/dev that should be
> >> ok, perhaps that's what I should have done in the first place.
> >>
> >> What do you think ?
> >> Regards,
> >> Tommaso
> >>
> >> [1] : https://dist.apache.org/repos/dist/dev/incubator/joshua/6.1/
> >> [2] :
> >> https://repository.apache.org/content/repositories/orgapachejoshua-1005/org/apache/joshua/joshua-incubating/6.1/
> >>
> >>
> >>
>
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>
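
For the checksum question in this thread, a minimal sketch of comparing a release artifact against a published digest file. The thread does not say which hash the RC4 checksum files use, so SHA-512 is an assumption here, as is the `<hex>  <filename>` layout of the digest file; the function names are hypothetical.

```python
import hashlib


def sha512_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-512 digest of the given bytes."""
    return hashlib.sha512(data).hexdigest()


def digest_matches(data: bytes, checksum_file_text: str) -> bool:
    """Check data against a digest file.

    Assumption: the first whitespace-separated token in the file is the
    hex digest (for example "<hex>  <filename>"). Other layouts exist and
    would need their own parsing.
    """
    tokens = checksum_file_text.split()
    if not tokens:
        # An empty digest file can never confirm an artifact.
        return False
    return tokens[0].lower() == sha512_hex(data)
```

In practice `data` would be the bytes of the downloaded source archive and `checksum_file_text` the contents of the digest file published next to it. A mismatch means either the artifact or the digest file needs to be regenerated before the vote continues.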


Re: [VOTE] Release Apache Joshua (Incubating) 6.1

2016-11-18 Thread Henry Saputra
Sorry I was late on dev@ list for this Vote, Lewis

Looks like I have to -1 for this one:
Missing DISCLAIMER file for the source release artifact
NOTICE.txt file contains Apache HTrace instead of Apache Joshua

Minor issue:
Extra file "pom.xml.release.releaseBackup"

- Henry

On Fri, Nov 18, 2016 at 2:11 PM, lewis john mcgibbney 
wrote:

> Hello general@incubator,
> Please VOTE on the Apache Joshua 6.1 Release Candidate #1. The release VOTE
> has passed over on user@ and dev@joshua with the following results:
> http://www.mail-archive.com/dev%40joshua.incubator.apache.org/msg01884.html
>
> We solved 44 issues: https://s.apache.org/joshua6.1
>
> Git source tag (167489bbd78526b9833fe7c88646bf96101d5d2b):
> https://s.apache.org/joshua6.1tag
>
> Staging repo:
> https://repository.apache.org/content/repositories/orgapachejoshua-1000/
>
> Source Release Artifacts:
> https://dist.apache.org/repos/dist/dev/incubator/joshua/
>
> PGP release keys (signed using 48BAEBF6):
> https://dist.apache.org/repos/dist/release/incubator/joshua/KEYS
>
> Vote will be open for 72 hours.
> Thank you to everyone that is able to VOTE as well as everyone that
> contributed to Apache Joshua 6.1.
>
> [ ] +1, let's get it released!!!
> [ ] +/-0, fine, but consider to fix few issues before...
> [ ] -1, nope, because... (and please explain why)
>
> P.S. here is my +1
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>


Reduce number of branches in the source repo

2016-11-18 Thread Henry Saputra
Hi All,

I think we have a bit too many branches in our source repo.

Would it be easier to keep the branches to just master and release
branches?



- Henry


Re: Language Pack English-Japanese

2016-07-22 Thread Henry Saputra
Hi Toshiki,

For this kind of discussion, let's have it in the dev@ list.

You can ask the question to dev@joshua.incubator.apache.org.

Thanks,

Henry

On Thu, Jul 21, 2016 at 9:46 PM, IGA Tosiki  wrote:

> Hi Matt,
>
> Thanks for your reply!
>
> I'm happy to read your mail. I want to help you with a Japanese-English
> language pack.
> And YES, I mean translation memories by TMS/XLIFF. But I can convert
> TMS to whatever format you specify.
>
> I also know that English to Japanese is very difficult, but I
> believe a sample English-Japanese language pack will attract many
> Japanese people to use Joshua.
>
> Regards,
> Toshiki
>
> 2016-07-22 12:42 GMT+09:00 Matt Post :
> > Hi,
> >
> > There is no Japanese--English language pack, but I would be happy to
> > build one if you could help by pointing me to data. What we need is
> > parallel data in the form of sentences that are translations of each other.
> > If you have access to this or pointers to where I could find some, I would
> > be happy to build it. There are likely standard datasets available; people
> > like Graham Neubig (http://www.phontron.com) have been working on this
> > for a while.
> >
> > What are TMS and LTIFF? Are you talking about translation memories?
> >
> > As a side note, translation between English and Japanese is very
> > difficult and tends not to be very good. One approach that helps is
> > translating from trees and forests. Joshua does not have this capability at
> > the moment.
> >
> > Sincerely,
> > matt
> >
> >
> >> On Jul 21, 2016, at 11:28 PM, IGA Tosiki  wrote:
> >>
> >> Hi team,
> >>
> >> I am interested in Joshua and its language packs. I am Japanese, and I
> >> would like to know about a Japanese language pack.
> >>
> >> Is there any plan to build a Japanese-English language pack?
> >> I believe TMS or LTIFF would be useful for building such a language pack. I
> >> have many OSS-based TMS between English-Japanese. Is there any way to
> >> use TMX or LTIFF as input for a Joshua language pack?
> >>
> >> Best regards,
> >> Toshiki Iga
> >
>


Re: [DISCUSS] Joshua main Website redirect to wiki?

2016-07-09 Thread Henry Saputra
Need to add the incubator logo [1] as part of website branding


[1] http://incubator.apache.org/guides/branding.html

On Sat, Jul 9, 2016 at 2:55 PM, Matt Post <p...@cs.jhu.edu> wrote:

> I think the audit had to do with the lack of a disclaimer and the fact
> that we listed it as "Joshua" instead of "Apache Joshua". I fixed both of
> those.
>
> matt (from my phone)
>
> > On Jul 9, 2016, at 5:53 PM, Henry Saputra <henry.sapu...@gmail.com>
> wrote:
> >
> > Hi Matt,
> >
> > No objection from my side, I think I missed the discussion thread about it.
> > I sincerely apologize.
> >
> > I am looking at incubator website guide and don't see any rule about how
> > the HTML is created.
> > I asked John about the audit to see what he has to say.
> >
> >
> > - Henry
> >
> >> On Sat, Jul 9, 2016 at 2:42 PM, Matt Post <p...@cs.jhu.edu> wrote:
> >>
> >> I mentioned it a while back and no one objected, so I did it.
> >>
> >> The issue is that the GitHub approach no longer worked because Apache does
> >> not employ Jekyll server side, so there was a major impediment to editing
> >> files.
> >>
> >> I'm open to other options but this is very convenient!
> >>
> >> matt (from my phone)
> >>
> >>>> On Jul 9, 2016, at 5:31 PM, Henry Saputra <henry.sapu...@gmail.com>
> >>> wrote:
> >>>
> >>> HI All,
> >>>
> >>> I just noticed that the main Joshua website [1] now redirects to the
> >>> wiki [2].
> >>>
> >>> Was there a discussion about why we are doing it this way? I remember there
> >>> used to be an HTML website for the main page.
> >>>
> >>> Thanks,
> >>>
> >>> Henry
> >>>
> >>> [1] https://joshua.incubator.apache.org
> >>> [2]
> >>> https://cwiki.apache.org/confluence/display/JOSHUA/Apache+Joshua+%28Incubating%29+Home
> >>
> >>
>
>


Re: Website Branding Issues

2016-07-09 Thread Henry Saputra
Hi John,

Was there a concern about the website being redirected to the wiki?

- Henry

On Sat, Jul 9, 2016 at 11:11 AM, Matt Post  wrote:

> Hi John,
>
> I believe I just corrected this:
>
> http://joshua.incubator.apache.org
>
> However, I don't know who our TLP sponsor is, so the disclaimer is missing
> that portion of the notice. Can someone advise me here?
>
> matt
>
>
> > On Jul 9, 2016, at 11:18 AM, John D. Ament 
> wrote:
> >
> > Ping.  When can this be expected to be resolved?
> >
> > On 2016-07-01 17:55 (-0400), johndam...@apache.org wrote:
> >> Dear podling,
> >>
> >> During a recent audit of podling websites, your podling was identified
> >> as not including the incubating disclaimer.  This disclaimer is required on
> >> all podling websites, releases and announcements, to clarify that your
> >> project may not be in compliance with all ASF processes.
> >>
> >> Please review the Incubator's branding guide:
> >> http://incubator.apache.org/guides/branding.html
> >>
> >> The full list of observations for all projects can be found at:
> >> https://wiki.apache.org/incubator/BrandingAuditJune2016
> >>
> >> If you have any questions, feel free to respond to this email or email
> >> our general@ mailing list.  Please note that I am not subscribed to your
> >> mailing list.
> >>
>
>


Re: too many emails

2016-05-26 Thread Henry Saputra
+1 for the idea.

I think most ASF projects with GitHub integration have a similar approach.

- Henry

On Thu, May 26, 2016 at 11:53 AM, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:

> Hey Matt,
>
> To be clear, I’m asking for input on a name amongst those choices.
>
> Also, we shouldn’t be archiving emails so we forget about them.
> The point is the conversation for the project should happen here
> and if it’s dev relevant conversation then it should be something
> that those that don’t have the advantage of operating at GitHub
> and believing that’s the home for the project still have a chance
> to participate by sending mails and participating in the conversation
> for the project, here.
>
> That said, I think there’s a simple solution:
>
> 1) stand up new list (we can use the mlreq program here, once
> we agree on the name)
> https://infra.apache.org/officers/mlreq/
>
>
> 2) file INFRA JIRA ticket and have ASF GitHub bot send communication
> to list from #1
>
> Make sense? Agree?
>
> Cheers,
> Chris
>
> ++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW:  http://sunset.usc.edu/~mattmann/
> ++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++
>
> On 5/26/16, 11:23 AM, "Matt Post"  wrote:
>
> >Chris, to be clear, are you asking for input on a name, or suggesting
> >creating all three lists?
> >
> >The main issue I'm concerned with is that comments on Github generate
> >three emails:
> >
> >- Github sends an email to dev
> >- If the comment is on a pull that matches a JIRA issue, the ASF Github
> >  bot sends an email to me
> >- It also sends the same email to dev
> >
> >I think we should just (a) tell Github to stop posting to dev and (b)
> >tell the ASF Github bot to send everything to commits. We could create a
> >new list, but there's some complexity in maintaining lists themselves, and
> >it seems that commits would be a good place to bury things and forget about
> >them.
> >
> >Would that satisfy the archiving goals? Who can do this? I don't seem to
> >have Github permission on incubator-joshua to do (a), and I don't know how
> >to do (b).
> >
> >matt
> >
> >
> >
> >
> >> On May 26, 2016, at 11:34 AM, kellen sunderland <
> kellen.sunderl...@gmail.com> wrote:
> >>
> >> I'd +1 as well.  Your breakdown looks good to me Chris.
> >>
> >> On Thu, May 26, 2016 at 4:12 PM, Mattmann, Chris A (3980) <
> >> chris.a.mattm...@jpl.nasa.gov> wrote:
> >>
> >>> +1 to a separate list for GitHub stuff. Many communities (Kudu,
> >>> Spark, etc.) end up doing this.
> >>>
> >>> How about:
> >>>
> >>> revi...@joshua.incubator.apache.org
> >>> git...@joshua.incubator.apache.org
> >>> iss...@joshua.incubator.apache.org
> >>>
> >>> Any of those?
> >>>
> >>> ++
> >>> Chris Mattmann, Ph.D.
> >>> Chief Architect
> >>> Instrument Software and Science Data Systems Section (398)
> >>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>> Office: 168-519, Mailstop: 168-527
> >>> Email: chris.a.mattm...@nasa.gov
> >>> WWW:  http://sunset.usc.edu/~mattmann/
> >>> ++
> >>> Director, Information Retrieval and Data Science Group (IRDS)
> >>> Adjunct Associate Professor, Computer Science Department
> >>> University of Southern California, Los Angeles, CA 90089 USA
> >>> WWW: http://irds.usc.edu/
> >>> ++
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On 5/26/16, 6:00 AM, "Matt Post"  wrote:
> >>>
> >>>> I agree it's good to have Github stuff archived on Apache-owned domains,
> >>>> I just think that the list gets overwhelmed with garbage that most people
> >>>> are just deleting. I mean, I like the idea of skimming through commits, but
> >>>> today I am waking up to over 100 emails, and I have to pick out the
> >>>> auto-generated emails that I don't have time to read from the important
> >>>> ones. If most people are just saving things to a separate folder, that they
> >>>> are never going to read, isn't it better to turn off those auto-emails?
> >>>>
> >>>> Why not use a separate list like git@ or archive@ for such posts? Then
> >>>> it's there for people to search, but no one has to wade through it.
> >>>>
> > On May 26, 2016, at 12:45 AM, Lewis John Mcgibbney <
> >>> lewis.mcgibb...@gmail.com> wrote:
> 

Re: too many emails

2016-05-25 Thread Henry Saputra
We can move the GitHub announcements to a different list if they are too noisy.

On Wed, May 25, 2016 at 12:48 PM, Matt Post  wrote:

> Does someone know how to turn off the mailing of all github comments to
> dev?
>
> The way I see it, we all have to be on dev, so it should be for people,
> not robots. I am getting every comment about three times.
>
> I would just do it but I don't know how.


Re: ApacheCon Meetup

2016-05-13 Thread Henry Saputra
Looks like this is not happening?

We could probably make it happen as a Google Hangout next time.

- Henry

On Fri, May 13, 2016 at 11:34 AM, Henri Yandell  wrote:

> We are? :)
>
> Work meetings came up as urgent for this morning, so only just noticed
> this. :(
>
> Sorry,
>
> Hen
>
>
> On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
> > Hi Folks,
> > Kellen, Henri and I are going to get together tomorrow 13th around
> > lunchtime PST to talk everything Joshua.
> > Would be great to have others online via GChat if possible.
> > Let's say around 11am PST for the time being.
> > See you then folks.
> > Thanks
> > Lewis
> >
> >
> > --
> > *Lewis*
> >
>


Re: ApacheCon Meetup

2016-05-13 Thread Henry Saputra
Cool, sounds good =)

- Henry

On Thu, May 12, 2016 at 6:08 PM, kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> I just wanted to discuss it as a group.  Your approach looks good to me.
>
> On Thu, May 12, 2016 at 6:05 PM, Henry Saputra <henry.sapu...@gmail.com>
> wrote:
>
> > Ah sorry, trigger happy
> >
> > About logging: are you proposing to use the log4j interface in the code? I
> > would recommend using slf4j [1] as the facade abstraction.
> > The implementation could then be done via log4j or logback.
> >
> > Love to see API access to Joshua.
> >
> > - Henry
> >
> > [1] http://www.slf4j.org
> >
> > On Thu, May 12, 2016 at 6:03 PM, Henry Saputra <henry.sapu...@gmail.com>
> > wrote:
> >
> > > About logging. Are you proposing to use log4j interface in the code? I
> > > would recommend to use slf4j [1]
> > >
> > >
> > > [
> > >
> > > On Thu, May 12, 2016 at 2:30 PM, kellen sunderland <
> > > kellen.sunderl...@gmail.com> wrote:
> > >
> > >> Thanks for organizing Lewis,
> > >>
> > > >> Here's some topics for discussion I've been noting while working with
> > > >> Joshua.  None of these are high priority issues for me, but if we are all
> > > >> in agreement on them it might make sense to log them.
> > > >>
> > > >> Boring code convention stuff: Logging with log4j, throw Runtime Exceptions
> > > >> instead of Typed, remove all system exits (replace with RuntimeExceptions),
> > > >> refactor some large files.
> > > >>
> > > >> Testing: Integrate existing unit tests, provide some good test examples so
> > > >> others can begin adding more tests.
> > > >>
> > > >> Configuration: We also touched on IoC, CLI args, and configuration changes
> > > >> that are possible.
> > > >>
> > > >> OO stuff: Joshua is pretty good here, but I would personally prefer more
> > > >> granular interfaces.  I wouldn't advocate radical changes, but maybe a
> > > >> little refactoring might make sense to better align with the interface
> > > >> segregation principle.
> > > >> https://en.wikipedia.org/wiki/Interface_segregation_principle
> > > >>
> > > >> JNI reliance:  We've found KenLM works really well with Joshua, but there
> > > >> is one issue with using it.  It requires many JNI calls during decoding and
> > > >> these calls impact GC performance.  In fact when a JNI call happens the GC
> > > >> throws out any work it may have done and quits until the JNI call
> > > >> completes.  The GC will then resume and start marking objects for
> > > >> collection from scratch.  This is not ideal especially for programs with
> > > >> large heaps (Joshua / Spark).  There's a couple ways we could mitigate this
> > > >> and I think they'd all speed up Joshua quite a lot.
> > > >>
> > > >> High level roadmap topics:
> > > >>
> > > >> *  Distributed Decoding is something I'll likely continue working on.
> > > >> There's some obvious things we can do given usage patterns of translation
> > > >> engines that can help us out here (I think).
> > > >> *  Providing a way to optimize Joshua for low-latency, low-throughput calls
> > > >> could be interesting for those with near real-time use cases.  Providing a
> > > >> way to optimize for high-latency, high-throughput could be interesting for
> > > >> async/batch use cases.
> > > >> *  The machine learning optimization algorithms could be cleaned up a bit
> > > >> (MERT/MIRA).
> > > >> *  The Vocabulary could probably be replaced with a simpler implementation
> > > >> (without sacrificing performance).
> > >>
> > >> -Kellen
> > >>
> > >>
> > >>
> > >> On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney <
> > >> lewis.mcgibb...@gmail.com> wrote:
> > >>
> > >> > Hi Folks,
> > >> > Kellen, Henri and I are going to get together tomorrow 13th around
> > >> > lunchtime PST to talk everything Joshua.
> > >> > Would be great to have others online via GChat if possible.
> > >> > Let's say around 11am PST for the time being.
> > >> > See you then folks.
> > >> > Thanks
> > >> > Lewis
> > >> >
> > >> >
> > >> > --
> > >> > *Lewis*
> > >> >
> > >>
> > >
> > >
> >
>


Re: ApacheCon Meetup

2016-05-12 Thread Henry Saputra
About logging. Are you proposing to use log4j interface in the code? I
would recommend to use slf4j [1]


[

On Thu, May 12, 2016 at 2:30 PM, kellen sunderland <
kellen.sunderl...@gmail.com> wrote:

> Thanks for organizing Lewis,
>
> Here's some topics for discussion I've been noting while working with
> Joshua.  None of these are high priority issues for me, but if we are all
> in agreement on them it might make sense to log them.
>
> Boring code convention stuff: Logging with log4j, throw Runtime Exceptions
> instead of Typed, remove all system exits (replace with RuntimeExceptions),
> refactor some large files.
>
> Testing: Integrate existing unit tests, provide some good test examples so
> others can begin adding more tests.
>
> Configuration: We also touched on IoC, CLI args, and configuration changes
> that are possible.
>
> OO stuff: Joshua is pretty good here, but I would personally prefer more
> granular interfaces.  I wouldn't advocate radical changes, but maybe a
> little refactoring might make sense to better align with the interface
> segregation principle.
> https://en.wikipedia.org/wiki/Interface_segregation_principle
>
> JNI reliance:  We've found KenLM works really well with Joshua, but there
> is one issue with using it.  It requires many JNI calls during decoding and
> these calls impact GC performance.  In fact when a JNI call happens the GC
> throws out any work it may have done and quits until the JNI call
> completes.  The GC will then resume and start marking objects for
> collection from scratch.  This is not ideal especially for programs with
> large heaps (Joshua / Spark).  There's a couple ways we could mitigate this
> and I think they'd all speed up Joshua quite a lot.
>
> High level roadmap topics:
>
> *  Distributed Decoding is something I'll likely continue working on.
> There's some obvious things we can do given usage patterns of translation
> engines that can help us out here (I think).
> *  Providing a way to optimize Joshua for low-latency, low-throughput calls
> could be interesting for those with near real-time use cases.  Providing a
> way to optimize for high-latency, high-throughput could be interesting for
> async/batch use cases.
> *  The machine learning optimization algorithms could be cleaned up a bit
> (MERT/MIRA).
> *  The Vocabulary could probably be replaced with a simpler implementation
> (without sacrificing performance).
>
> -Kellen
>
>
>
> On Thu, May 12, 2016 at 12:32 PM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
> > Hi Folks,
> > Kellen, Henri and I are going to get together tomorrow 13th around
> > lunchtime PST to talk everything Joshua.
> > Would be great to have others online via GChat if possible.
> > Let's say around 11am PST for the time being.
> > See you then folks.
> > Thanks
> > Lewis
> >
> >
> > --
> > *Lewis*
> >
>


Re: [WELCOME] Felix Hieber and Kellen Sunderland to Joshua Committership

2016-04-13 Thread Henry Saputra
Welcome!

- Henry

On Tue, Apr 12, 2016 at 8:09 AM, Matt Post  wrote:

> Yes, welcome! I'm excited to have you aboard.
>
> matt
>
> > On Apr 12, 2016, at 10:05 AM, Mattmann, Chris A (3980) <
> chris.a.mattm...@jpl.nasa.gov> wrote:
> >
> > Welcome!!
> >
> > ++
> > Chris Mattmann, Ph.D.
> > Chief Architect
> > Instrument Software and Science Data Systems Section (398)
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 168-519, Mailstop: 168-527
> > Email: chris.a.mattm...@nasa.gov
> > WWW:  http://sunset.usc.edu/~mattmann/
> > ++
> > Director, Information Retrieval and Data Science Group (IRDS)
> > Adjunct Associate Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > WWW: http://irds.usc.edu/
> > ++
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> > On 4/11/16, 10:14 PM, "Tommaso Teofili" 
> wrote:
> >
> >> Welcome on board Felix and Kellen !
> >> Regards,
> >> Tommaso
> >>
> >> Il giorno lun 11 apr 2016 alle ore 22:09 Lewis John Mcgibbney <
> >> lewis.mcgibb...@gmail.com> ha scritto:
> >>
> >>> Hi Folks,
> >>> Glad to say that both Felix and Kellen are now on board as official
> >>> committers for Apache Joshua (Incubating).
> >>> @Felix and Kellen please feel free to say a bit about yourselves if you
> >>> feel like it.
> >>> I would like to formally welcome you to the committership. There are a
> >>> bunch of resources available for you both, please see [0]. If you have any
> >>> issues or questions then please feel free to reach out to us here.
> >>> Thank you and welcome.
> >>> Lewis
> >>>
> >>> [0] http://www.apache.org/dev/#committers
> >>>
> >>>
> >>> --
> >>> *Lewis*
> >>>
>
>


Re: Joshua development strategy

2016-04-07 Thread Henry Saputra
Sweet :)

On Wednesday, April 6, 2016, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
wrote:

> Yep Henry np
>
> https://cwiki.apache.org/confluence/display/JOSHUA/Joshua+Meetups#JoshuaMeetups-Logistics
> We will put GHangout details in there closer to the time.
>
> On Wed, Apr 6, 2016 at 8:23 AM, Henry Saputra <henry.sapu...@gmail.com>
> wrote:
>
> > Just realized we now have separate Apache Big Data and Apache Con, so I won't
> > be able to come to the Apache Con one =(
> >
> > Any chance of a Google Hangout so I can participate remotely? ;)
> >
> > - Henry
> >
> > On Tue, Apr 5, 2016 at 4:15 PM, Lewis John Mcgibbney <
> > lewis.mcgibb...@gmail.com> wrote:
> >
> > > Sounds like the release process documentation would be a good thing to
> be
> > > addressed at ApacheCon.
> > >
> > > On Tuesday, April 5, 2016, Matt Post <p...@cs.jhu.edu> wrote:
> > >
> > > > Hi Tom,
> > > >
> > > > Thanks for this. I adopted a simplified version of git-flow a while
> > back:
> > > >
> > > > - Trying to use branches for experimental stuff
> > > >
> > > > - Merging back into "master" when done (with --no-ff), with the rule
> > that
> > > > pushes to master should always pass all the test suites
> > > >
> > > > - Using a release branch for actual point releases.
> > > >
> > > > I'd be happy to continue this and even to do it right; it's probably
> a
> > > > good idea especially if development picks up the way it seems to be.
> > > >
> > > > Once I get to a redo of the documentation (which will probably not happen
> > > > prior to ApacheCon), I can formalize this (though I'm happy if someone else
> > > > wants to prior to that).
> > > >
> > > > matt
> > > >
> > > >
> > > > > On Mar 29, 2016, at 3:40 PM, Tom Barber <t...@analytical-labs.com> wrote:
> > > > >
> > > > > Moved over to dev@ with useful information still in place.
> > > > >
> > > > > Yeah I don't think the ASF is onboard yet with git pull type
> > workflows.
> > > > Its
> > > > > still mostly peer review through the reviewboard and pull request
> > > > reviews.
> > > > > I'm not saying we do away with them either, but I do think the ASF
> > > > doesn't
> > > > > make the best use of git with the forking strategy for established
> > > > > committers.
> > > > >
> > > > > Clearly if you don't have commit rights to the project it would
> need
> > to
> > > > be
> > > > > a PR/reviewboard submission anyway, but from an entirely personal
> > > > > perspective I much prefer people developing on the same repository
> > > > instead
> > > > > of github forks as it makes for much easier collaboration and
> keeping
> > > the
> > > > > code in sync. Of course you can accept pull requests onto feature
> > > > branches
> > > > > etc as well.
> > > > >
> > > > > As I said, it doesn't have to be set in stone either, as committers
> > we
> > > > just
> > > > > make sure we don't commit to the master (or other named branch)
> that
> > is
> > > > for
> > > > > releases.
> > > > >
> > > > > Even on personal forks I tend to do git flow and just push back to
> > the
> > > > > correct branch for the project.
> > > > >
> > > > > Anyway as I said just my 2 cents.
> > > > >
> > > > > On 29 March 2016 at 20:26, Henry Saputra <henry.sapu...@gmail.com> wrote:
> > > > >
> > > > >> We could bring this discussion back to dev@ list.
> > > > >>
> > > > >> I like the git flow model too, but I don't think any other ASF
> > > projects
> > > > >> using develop branch concept. For now all PRs and patches are
> > targeted
> > > > for
> > > > >> master.
> > > > >>
> > > > >> - Henry
> > > > >>
>

Re: Joshua development strategy

2016-04-06 Thread Henry Saputra
Just realized we now have separate Apache Big Data and Apache Con, so I won't
be able to come to the Apache Con one =(

Any chance of a Google Hangout so I can participate remotely? ;)

- Henry

On Tue, Apr 5, 2016 at 4:15 PM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:

> Sounds like the release process documentation would be a good thing to be
> addressed at ApacheCon.
>
> On Tuesday, April 5, 2016, Matt Post <p...@cs.jhu.edu> wrote:
>
> > Hi Tom,
> >
> > Thanks for this. I adopted a simplified version of git-flow a while back:
> >
> > - Trying to use branches for experimental stuff
> >
> > - Merging back into "master" when done (with --no-ff), with the rule that
> > pushes to master should always pass all the test suites
> >
> > - Using a release branch for actual point releases.
> >
> > I'd be happy to continue this and even to do it right; it's probably a
> > good idea especially if development picks up the way it seems to be.
> >
> > Once I get to a redo of the documentation (which will probably not happen
> > prior to ApacheCon), I can formalize this (though I'm happy if someone
> > else wants to prior to that).
> >
> > matt
> >
> >
> > > > On Mar 29, 2016, at 3:40 PM, Tom Barber <t...@analytical-labs.com>
> > > > wrote:
> > > >
> > > > Moved over to dev@ with useful information still in place.
> > > >
> > > > Yeah I don't think the ASF is onboard yet with git pull type workflows.
> > > > It's still mostly peer review through the reviewboard and pull request
> > > > reviews. I'm not saying we do away with them either, but I do think the
> > > > ASF doesn't make the best use of git with the forking strategy for
> > > > established committers.
> > > >
> > > > Clearly if you don't have commit rights to the project it would need to
> > > > be a PR/reviewboard submission anyway, but from an entirely personal
> > > > perspective I much prefer people developing on the same repository
> > > > instead of GitHub forks, as it makes for much easier collaboration and
> > > > keeping the code in sync. Of course you can accept pull requests onto
> > > > feature branches etc. as well.
> > > >
> > > > As I said, it doesn't have to be set in stone either; as committers we
> > > > just make sure we don't commit to the master (or other named branch)
> > > > that is for releases.
> > > >
> > > > Even on personal forks I tend to do git flow and just push back to the
> > > > correct branch for the project.
> > > >
> > > > Anyway as I said just my 2 cents.
> > > >
> > > > On 29 March 2016 at 20:26, Henry Saputra <henry.sapu...@gmail.com>
> > > > wrote:
> > >
> > >> We could bring this discussion back to dev@ list.
> > >>
> > > >> I like the git flow model too, but I don't think any other ASF
> > > >> projects are using the develop branch concept. For now all PRs and
> > > >> patches are targeted for master.
> > >>
> > >> - Henry
> > >>
> > > >> On Tue, Mar 29, 2016 at 12:06 PM, Tom Barber <t...@analytical-labs.com>
> > > >> wrote:
> > >>
> > >>> To keep code stable I'm a fan of "git flow", either using the tooling
> > >>> or just using the methodology; that way you always have a stable
> > >>> branch to work off.
> > >>>
> > >>> Master branch never gets commits to it and always reflects the latest
> > >>> release.
> > >>> Development branch gets sporadic commits to fix stuff or add minor new
> > >>> bits.
> > >>> Feature-XYZ is a major new feature branch branched from development.
> > >>> Development gets merged into it to keep it in sync, and when a feature
> > >>> is complete and tests passing, it gets merged into development.
> > >>> Hotfix-XYZ is branched from Master to provide hotfix patches and gets
> > >>> merged back into master and development.
> > >>> Release-XYZ is a release branch; minor bug fixes go into this branch
> > >>> prior to release, then it gets merged back into master & development
> > >>> when it's done.
> > >>>
> > >>> This way you keep your codebase clean and works well when you have a
> > >>> bunch o
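
[Editor's sketch] The branch rules in the git-flow description above can be
written down as a small lookup. This is only an illustration, not tooling used
by the project: the branch names and prefixes come from the thread, while the
function names (merge_targets, allows_direct_commits) are hypothetical and
invented for this example.

```python
# Sketch of the merge rules from the git-flow description in the thread.
# Assumption: branches are named "<kind>-<label>", e.g. "Feature-XYZ".
# The helper names below are hypothetical, chosen only for this example.

MERGE_TARGETS = {
    "feature": ["development"],            # merged into development when done
    "hotfix": ["master", "development"],   # merged back into both
    "release": ["master", "development"],  # merged back into both
}


def merge_targets(branch: str) -> list[str]:
    """Return the branches that `branch` is merged into once it is finished."""
    kind = branch.split("-", 1)[0].lower()
    if kind not in MERGE_TARGETS:
        raise ValueError(f"not a feature, hotfix, or release branch: {branch}")
    return list(MERGE_TARGETS[kind])


def allows_direct_commits(branch: str) -> bool:
    """Master never receives direct commits; other branches may."""
    return branch.lower() != "master"
```

For example, merge_targets("Hotfix-XYZ") gives both master and development,
matching the rule that hotfixes are merged back into both branches.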

Re: Migrating Community from GitHub and Google Groups to Apache

2016-03-25 Thread Henry Saputra
Woot!

On Fri, Mar 25, 2016 at 11:22 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:

> BOOM
>
> https://github.com/apache/incubator-joshua
>
>
>
> On Thu, Mar 24, 2016 at 10:55 AM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
> > Thanks Matt, enjoy the time off (hopefully you are not ill)
> > Later
> >
> > On Thu, Mar 24, 2016 at 10:54 AM, Matt Post wrote:
> >
> >> (offline yesterday and today, will do tomorrow)
> >>
> >> matt (from my phone)
> >>
> >> > On Mar 24, 2016, at 1:48 PM, Lewis John Mcgibbney <
> >> > lewis.mcgibb...@gmail.com> wrote:
> >> >
> >> > Hi Matt,
> >> > As the primary figure within the Joshua community I wonder if you can
> >> > act on the following. It will go a long way in enabling us to
> >> > transition things over.
> >> >
> >> >   1. Can you create a new branch on the GitHub repo at [0] called
> >> >   'apache' with only a README which states that the Joshua project has
> >> >   been transitioned to the Apache Incubator? A link to the GitHub repo
> >> >   at [1] (I've opened [2] to get this set up) would be great. Also a
> >> >   link to the new website (once we get it up and running) at [3].
> >> >   2. Post a message to the existing Joshua Google Groups which states
> >> >   the same as the above but references the new mailing lists at
> >> >   u...@joshua.incubator.apache.org and dev@joshua.incubator.apache.org
> >> >   3. Post to the Joshua website that the project has been migrated over
> >> >   to the Apache Software Foundation and provide relevant links as
> >> >   posted above.
> >> >
> >> > If we address the above it would be ideal.
> >> >
> >> > Thanks Matt, please let me know if there are any clarifications
> >> > required.
> >> >
> >> > [0] https://github.com/joshua-decoder/joshua
> >> >
> >> > [1] http://github.com/apache/incubator-joshua
> >> >
> >> > [2] https://issues.apache.org/jira/browse/INFRA-11539
> >> >
> >> > [3] http://joshua.incubator.apache.org
> >> >
> >> > --
> >> > *Lewis*
> >>
> >>
> >
> >
> > --
> > *Lewis*
> >
>
>
>
> --
> *Lewis*
>