Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release

2017-06-28 Thread Matt Post
Yes, tighter integration with other Apache projects sounds like a good idea to 
me. Rewriting Thrax to use a more modern tool would also be hugely helpful to 
Joshua in the long term. It is getting harder and harder to find and maintain 
(much less justify) Hadoop clusters that are separate from other research ones.


> On Jun 28, 2017, at 3:42 AM, Tommaso Teofili wrote:
> 
> +1
> 
> Tommaso
> 
> On Wed, Jun 28, 2017 at 07:46, lewis john mcgibbney <
> lewi...@apache.org> wrote:
> 
>> Hi Suneel,
>> I think it's worth opening a JIRA issue and we can possibly mark it for
>> 7.X?
>> lewis
>> 
>> On Tue, Jun 27, 2017 at 9:36 PM, <
>> dev-digest-h...@joshua.incubator.apache.org> wrote:
>> 
>>> 
>>> From: Suneel Marthi 
>>> To: dev@joshua.incubator.apache.org
>>> Cc:
>>> Bcc:
>>> Date: Fri, 23 Jun 2017 01:59:28 -0400
>>> Subject: Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release
>>> Congrats on the release.
>>> 
>>> I have been a silent lurker on this channel since I first heard of Joshua
>>> last September at Amazon, Berlin.
>>> 
>>> Tommaso and I recently gave a talk at Berlin Buzzwords 2017 -
>>> 'Embracing Diversity - searching over multiple languages' [1]
>>> using Apache Joshua for Machine Translation, and Apache OpenNLP for
>>> Language detection.
>>> 
>>> I have been wondering how much of the present VLPS can be replaced by
>>> OpenNLP with Flink/Beam pipelines.
>>> I did a talk last week at Hadoop Summit, San Jose about 'Large Scale Text
>>> processing with Apache OpenNLP and Apache Flink' [2].
>>> 
>>> Also, Thrax, which is presently MapReduce-based, can definitely be
>>> ported over to modern streaming distributed frameworks like Flink/Kafka
>>> Streams/Beam.
>>> 
>>> 
>>> [1]
>>> https://www.youtube.com/watch?v=ZrWxySF-9KY=20=2s;
>>> list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt
>>> [2] https://www.slideshare.net/SuneelMarthi/large-scale-text-processing
>>> 
>>> 
>>> 
>> 



Re: Merging 7.X into master??? + cleaning up branches

2017-06-28 Thread Matt Post
This is definitely a good idea. Many of these branches are dead and are 
unlikely to contain much that can be merged in, and are therefore probably best 
deleted. The plan for 7 was a big simplification of much of the guts, but with 
the transition to neural approaches in the research community, this is unlikely 
to be done unless it finds a new champion.




> On Jun 28, 2017, at 3:43 AM, Tommaso Teofili wrote:
> 
> +1 for both cleaning up branches *and* merging 7 branch into master.
> 
> Regarding branches and Git let me read through the links and I'll share my
> opinion.
> 
> Regards,
> Tommaso
> 
> On Wed, Jun 28, 2017 at 06:41, Chris Mattmann wrote:
> 
>> Hey Team,
>> 
>> I recommend that Joshua consider adopting the Tika and/or Nutch contribution
>> policy RE: branches and Git:
>> 
>> https://github.com/apache/tika/#contributing-via-github
>> https://github.com/apache/nutch/#contributing
>> 
>> Cheers,
>> Chris
>> 
>> 
>> 
>> On 6/27/17, 9:36 PM, "lewis john mcgibbney" wrote:
>> 
>>Hi Folks,
>>Two things...
>> 
>>   1. Currently the branches for Joshua are a bit of a mess... it would be
>>   better if they were named after JIRA issues such that the mappings back
>>   to some concrete development were explicit. Does anyone want to clean
>>   these up?
>>   2. Now that 6.1-incubating is released and live, is there any desire to
>>   merge 7.X branch into master and continue development there? I was not
>>   involved with the 7.X development but it looked like a significant step
>>   forward... it would be a shame for that work to stagnate.
>> 
>>Thanks,
>> 
>>lewis
>> 
>>--
>>http://home.apache.org/~lewismc/
>>@hectorMcSpector
>>http://www.linkedin.com/in/lmcgibbney
>> 
>> 
>> 
>> 



Re: Merging 7.X into master??? + cleaning up branches

2017-06-28 Thread Tommaso Teofili
+1 for both cleaning up branches *and* merging 7 branch into master.

Regarding branches and Git let me read through the links and I'll share my
opinion.

Regards,
Tommaso

On Wed, Jun 28, 2017 at 06:41, Chris Mattmann wrote:

> Hey Team,
>
> I recommend that Joshua consider adopting the Tika and/or Nutch contribution
> policy RE: branches and Git:
>
> https://github.com/apache/tika/#contributing-via-github
> https://github.com/apache/nutch/#contributing
>
> Cheers,
> Chris
>
>
>
> On 6/27/17, 9:36 PM, "lewis john mcgibbney" wrote:
>
> Hi Folks,
> Two things...
>
>   1. Currently the branches for Joshua are a bit of a mess... it would be
>   better if they were named after JIRA issues such that the mappings back
>   to some concrete development were explicit. Does anyone want to clean
>   these up?
>   2. Now that 6.1-incubating is released and live, is there any desire to
>   merge 7.X branch into master and continue development there? I was not
>   involved with the 7.X development but it looked like a significant step
>   forward... it would be a shame for that work to stagnate.
>
> Thanks,
>
> lewis
>
> --
> http://home.apache.org/~lewismc/
> @hectorMcSpector
> http://www.linkedin.com/in/lmcgibbney
>
>
>
>


Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release

2017-06-28 Thread Tommaso Teofili
+1

Tommaso

On Wed, Jun 28, 2017 at 07:46, lewis john mcgibbney <
lewi...@apache.org> wrote:

> Hi Suneel,
> I think it's worth opening a JIRA issue and we can possibly mark it for
> 7.X?
> lewis
>
> On Tue, Jun 27, 2017 at 9:36 PM, <
> dev-digest-h...@joshua.incubator.apache.org> wrote:
>
> >
> > From: Suneel Marthi 
> > To: dev@joshua.incubator.apache.org
> > Cc:
> > Bcc:
> > Date: Fri, 23 Jun 2017 01:59:28 -0400
> > Subject: Re: [ANNOUNCE] - Apache Joshua 6.1 incubating release
> > Congrats on the release.
> >
> > I have been a silent lurker on this channel since I first heard of Joshua
> > last September at Amazon, Berlin.
> >
> > Tommaso and I recently gave a talk at Berlin Buzzwords 2017 -
> > 'Embracing Diversity - searching over multiple languages' [1]
> > using Apache Joshua for Machine Translation, and Apache OpenNLP for
> > Language detection.
> >
> > I have been wondering how much of the present VLPS can be replaced by
> > OpenNLP with Flink/Beam pipelines.
> > I did a talk last week at Hadoop Summit, San Jose about 'Large Scale Text
> > processing with Apache OpenNLP and Apache Flink' [2].
> >
> > Also, Thrax, which is presently MapReduce-based, can definitely be
> > ported over to modern streaming distributed frameworks like Flink/Kafka
> > Streams/Beam.
> >
> >
> > [1]
> > https://www.youtube.com/watch?v=ZrWxySF-9KY=20=2s;
> > list=PLq-odUc2x7i-9Nijx-WfoRMoAfHC9XzTt
> > [2] https://www.slideshare.net/SuneelMarthi/large-scale-text-processing
> >
> >
> >
>