Re: Updating Incubator summary

2016-11-17 Thread John Hewitt
@Matt, that sounds like an interesting goal. What's the hook?

@Henri, that sounds good. I like the idea of showing people snippets, as MT
isn't necessarily intuitive to the average Linux.com reader.

On Thu, Nov 17, 2016 at 5:44 AM, Matt Post wrote:

> My thinking on that roadmap was based on a comment Lewis made a while ago about
> incubator graduation being judged by the number of releases. If you think
> we can get out sooner, then I'm all for it! Maybe we can get the docker
> containers out and then push for it after that?
>
> I like your idea about a more concerted advertising effort. We could also
> try to pull together a demo paper for ACL, which is
> due in February. I think I might have a hook that would appeal to reviewers
> there.
>
>
> > On Nov 17, 2016, at 2:12 AM, Henri Yandell wrote:
> >
> > Sounds good :)
> >
> > My basic mantra is 'get the summary page all signed off, then start
> asking
> > "when graduate?"'. Projects can tend to linger in the Incubator awaiting
> > perfection.
> >
> > I wonder how you could take the 3rd item (Linux.com article) and make
> that
> > bigger. Perhaps encourage every committer to write a blog post so you end
> > up with the article as an intro, and then each committer's blog entry or
> > website hosted article as a personal "how I got into this" or "what I
> work
> > on" or "a commit I recently did, a commit I keep meaning to get
> around
> > to working on". Random thought :)
> >
> > Hen
> >
> > On Tue, Nov 15, 2016 at 11:09 AM, Matt Post wrote:
> >
> >> We're still waiting on our first software release, so it seems to me a
> bit
> >> premature to graduate? Though I don't know how these decisions are made
> —
> >> what goes into it?
> >>
> >> Here is the roadmap that I have in mind:
> >>
> >> - 6.1 release (imminent)
> >> - Large-scale release of language packs (imminent)
> >> - Linux.com article introducing people to MT, Joshua, language packs,
> and
> >> adding custom rules
> >> - Release of docker-based language packs (including KenLM)
> >> - 7.0 release (spring)
> >> - Graduate
> >>
> >> If we keep that rough schedule, we'll have incubated a year and have a
> lot
> >> to show for it.
> >>
> >> matt
> >>
> >>
> >>> On Nov 15, 2016, at 12:13 PM, Henri Yandell wrote:
> >>>
> >>> Thanks :)
> >>>
> >>> Reason for asking being that it felt that the standard checklist things
> >>> were complete and I was wondering what the path to graduation is?
> >>>
> >>> Any reason not to start thinking about a vote?
> >>>
> >>> On Tue, Nov 15, 2016 at 04:02 Matt Post wrote:
> >>>
>  Thanks, Lewis, and Henri, for pointing this out.
> 
> 
> > On Nov 15, 2016, at 1:18 AM, lewis john mcgibbney <
> lewi...@apache.org>
>  wrote:
> >
> > Hi Henri,
> > I just pushed the update to SVN. Should update asynch reasonably
> soon.
> >
> > http://incubator.apache.org/projects/joshua.html
> >
> > Thanks
> >
> > On Sun, Nov 13, 2016 at 1:22 PM, <
> > dev-digest-h...@joshua.incubator.apache.org> wrote:
> >
> >>
> >> From: Henri Yandell 
> >> To: dev@joshua.incubator.apache.org
> >> Cc:
> >> Date: Sun, 13 Nov 2016 01:17:57 -0800
> >> Subject: Updating Incubator summary
> >> Would be useful to update this page:
> >>
> >> http://incubator.apache.org/projects/joshua.html
> >>
> >>
> >> Are there any of the checklist items that are still open?
> >>
> >>
> > As far as I am aware no :)
> 
> 
> >>
> >>
>
>


[jira] [Commented] (JOSHUA-315) Thrax keeps all rules

2016-11-17 Thread Matt Post (JIRA)

[ https://issues.apache.org/jira/browse/JOSHUA-315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15673452#comment-15673452 ]

Matt Post commented on JOSHUA-315:
--

Yeah, I had expected a bigger savings, too. I should quantify it in terms of 
runtime, as well.

> Thrax keeps all rules
> -
>
> Key: JOSHUA-315
> URL: https://issues.apache.org/jira/browse/JOSHUA-315
> Project: Joshua
> Issue Type: Bug
> Reporter: Matt Post
> Fix For: 6.2
>
>
> When extracting rules, Thrax keeps *all* target-side options for each source side. For 
> large bitexts and common source sides (e.g., "de" for Spanish–English), there 
> can be tens of thousands of translations, due to errors in the alignments and 
> phenomena like garbage collection. The decoder throws out all but the top 
> num_translation_options of these (default 20), but before doing so, it has to 
> score all the target-side options with all feature functions, including the 
> language model. This slows down "warming up" of the model and means that the 
> first sentences to use these items are very slow to translate.
> I have updated scripts/training/filter-rules.pl to filter using Thrax's 
> rarity penalty field, but it would be much better if Thrax were to keep only 
> the 100 most frequent translation options for each source side.
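
For illustration only, the proposed pruning (keep the most frequent
translation options per source side) could be sketched roughly as below.
This is not Thrax code: the (source, target, count) tuple layout, the
function name keep_top_k, and the toy counts are all assumptions made up
for the example.

```python
import heapq
from collections import defaultdict


def keep_top_k(rules, k=100):
    """Keep the k highest-count target sides for each source side.

    `rules` is an iterable of (source, target, count) tuples. This layout
    is a hypothetical stand-in, not Thrax's actual rule representation.
    """
    by_source = defaultdict(list)
    for source, target, count in rules:
        by_source[source].append((count, target))

    pruned = {}
    for source, options in by_source.items():
        # nlargest compares (count, target) tuples, so ties on count
        # fall back to the target string and the result is deterministic.
        best = heapq.nlargest(k, options)
        pruned[source] = [(target, count) for count, target in best]
    return pruned


if __name__ == "__main__":
    # Toy data: a common source side with one rare, probably noisy option.
    toy_rules = [
        ("de", "of", 900),
        ("de", "from", 400),
        ("de", "the", 3),
        ("casa", "house", 50),
    ]
    print(keep_top_k(toy_rules, k=2))
```

Pruning at extraction time like this would mean the decoder never has to
score the long tail of options before applying num_translation_options.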





Re: "mvn assembly" no longer works

2016-11-17 Thread Matt Post
Ah, thanks Lewis. I did update the README to mention the new package target.



> On Nov 17, 2016, at 1:36 AM, lewis john mcgibbney wrote:
> 
> Hi Matt,
> Again, I am on digest and didn't receive the message, but I'll reply here.
> There is no need to use the Maven assembly plugin anymore; simply run mvn 
> package, and you will then see 
> ./target/joshua-6.2-SNAPSHOT-jar-with-dependencies.jar. The result is exactly 
> the same, but it is now a default Maven task rather than a custom plugin implementation.
> Do we need to update README?
> 
> -- 
> http://home.apache.org/~lewismc/ 
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney 



Re: Updating Incubator summary

2016-11-17 Thread Matt Post
My thinking on that roadmap was based on a comment Lewis made a while ago about 
incubator graduation being judged by the number of releases. If you think we 
can get out sooner, then I'm all for it! Maybe we can get the docker containers 
out and then push for it after that?

I like your idea about a more concerted advertising effort. We could also try 
to pull together a demo paper for ACL, which is due in 
February. I think I might have a hook that would appeal to reviewers there.


> On Nov 17, 2016, at 2:12 AM, Henri Yandell wrote:
> 
> Sounds good :)
> 
> My basic mantra is 'get the summary page all signed off, then start asking
> "when graduate?"'. Projects can tend to linger in the Incubator awaiting
> perfection.
> 
> I wonder how you could take the 3rd item (Linux.com article) and make that
> bigger. Perhaps encourage every committer to write a blog post so you end
> up with the article as an intro, and then each committer's blog entry or
> website hosted article as a personal "how I got into this" or "what I work
> on" or "a commit I recently did, a commit I keep meaning to get around
> to working on". Random thought :)
> 
> Hen
> 
> On Tue, Nov 15, 2016 at 11:09 AM, Matt Post wrote:
> 
>> We're still waiting on our first software release, so it seems to me a bit
>> premature to graduate? Though I don't know how these decisions are made —
>> what goes into it?
>> 
>> Here is the roadmap that I have in mind:
>> 
>> - 6.1 release (imminent)
>> - Large-scale release of language packs (imminent)
>> - Linux.com article introducing people to MT, Joshua, language packs, and
>> adding custom rules
>> - Release of docker-based language packs (including KenLM)
>> - 7.0 release (spring)
>> - Graduate
>> 
>> If we keep that rough schedule, we'll have incubated a year and have a lot
>> to show for it.
>> 
>> matt
>> 
>> 
>>> On Nov 15, 2016, at 12:13 PM, Henri Yandell wrote:
>>> 
>>> Thanks :)
>>> 
>>> Reason for asking being that it felt that the standard checklist things
>>> were complete and I was wondering what the path to graduation is?
>>> 
>>> Any reason not to start thinking about a vote?
>>> 
>>> On Tue, Nov 15, 2016 at 04:02 Matt Post wrote:
>>> 
 Thanks, Lewis, and Henri, for pointing this out.
 
 
> On Nov 15, 2016, at 1:18 AM, lewis john mcgibbney 
 wrote:
> 
> Hi Henri,
> I just pushed the update to SVN. Should update asynch reasonably soon.
> 
> http://incubator.apache.org/projects/joshua.html
> 
> Thanks
> 
> On Sun, Nov 13, 2016 at 1:22 PM, <
> dev-digest-h...@joshua.incubator.apache.org> wrote:
> 
>> 
>> From: Henri Yandell 
>> To: dev@joshua.incubator.apache.org
>> Cc:
>> Date: Sun, 13 Nov 2016 01:17:57 -0800
>> Subject: Updating Incubator summary
>> Would be useful to update this page:
>> 
>> http://incubator.apache.org/projects/joshua.html
>> 
>> 
>> Are there any of the checklist items that are still open?
>> 
>> 
> As far as I am aware no :)
 
 
>> 
>>