Matt,

Did you ever track down the people who contributed to the tokenizer? It
seems like we should be able to dual license that script. It would be very
nice to be able to include the Moses tokenizer and detokenizer as part of
NLTK.

Lane


On Fri, Apr 20, 2018 at 12:38 AM, liling tan <[email protected]> wrote:

> Dear Moses Devs and Community,
>
> Sorry for the delayed response.
>
> We've repackaged the MosesTokenizer Python code as a library and made it
> pip-able.
> https://github.com/alvations/sacremoses
>
> I hope that's okay with the Moses community and that the license
> compliance is in order now.
>
> Regards,
> Liling
>
>
>
> On Wed, Apr 11, 2018 at 1:41 AM, Matt Post <[email protected]> wrote:
>
>> Seems worth a shot. I suggest contacting each of them with individual
>> emails until (and if) you get a “no”.
>>
>> matt (from my phone)
>>
>> On 10 Apr 2018, at 19:26, liling tan <[email protected]> wrote:
>>
>> @Matt I'm not sure whether that'll work.
>>
>>
>> For tokenizer, that'll include:
>>
>>
>>    - phikoehn <https://github.com/phikoehn>
>>    - hieuhoang <https://github.com/hieuhoang>
>>    - bhaddow <https://github.com/bhaddow>
>>    - jimregan <https://github.com/jimregan>
>>    - kpu <https://github.com/kpu>
>>    - ugermann <https://github.com/ugermann>
>>    - pjwilliams <https://github.com/pjwilliams>
>>    - jgwinnup <https://github.com/jgwinnup>
>>    - mhuck <https://github.com/mhuck>
>>    - tofula <https://github.com/tofula>
>>    - a455bcd9 <https://github.com/a455bcd9>
>>
>>
>> And these for the detokenizer:
>>
>>    - phikoehn <https://github.com/phikoehn>
>>    - flammie <https://github.com/flammie>
>>    - hieuhoang <https://github.com/hieuhoang>
>>    - pjwilliams <https://github.com/pjwilliams>
>>    - bhaddow <https://github.com/bhaddow>
>>    - alvations <https://github.com/alvations>
>>
>> Not sure if everyone agrees though.
>>
>> Regards,
>> Liling
>>
>> On Wed, Apr 11, 2018 at 12:39 AM, Matt Post <[email protected]> wrote:
>>
>>> Liling—Would it work to get the permission of just those people who are
>>> in the commit log of the specific scripts you want to port?
>>>
>>> matt (from my phone)
>>>
>>> On 10 Apr 2018, at 18:19, liling tan <[email protected]> wrote:
>>>
>>> Got it.
>>>
>>> So I think we'll just remove the MosesTokenizer and MosesDetokenizer
>>> functions from NLTK and maybe create a PR to put them in
>>> mosesdecoder/scripts/tokenizer.
>>>
>>> Thank you for the clarification!
>>> Liling
>>>
>>> On Wed, Apr 11, 2018 at 12:17 AM, Hieu Hoang <[email protected]>
>>> wrote:
>>>
>>>> Still the same problem - everyone owns Moses so you need everyone's
>>>> permission, not just mine. So no
>>>>
>>>> Hieu Hoang
>>>> http://moses-smt.org/
>>>>
>>>>
>>>> On 10 April 2018 at 17:13, liling tan <[email protected]> wrote:
>>>>
>>>>> I understand.
>>>>>
>>>>> Could we have permission to derive work from Moses, specifically the
>>>>> (de-)tokenizer and possibly other scripts, in an MIT/Apache-licensed
>>>>> tool?
>>>>>
>>>>> Legally it's a restriction, but for what it's worth, I think mutual
>>>>> agreement between the OSS projects is sufficient to keep any port of
>>>>> LGPL work until someone starts to take legal action, and I think it's
>>>>> safe to fall back on taking down these functionalities in the
>>>>> Apache/MIT code.
>>>>>
>>>>> Regards,
>>>>> Liling
>>>>>
>>>>> On Wed, Apr 11, 2018 at 12:09 AM, Hieu Hoang <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> we can't change the license, or dual license it, without the
>>>>>> agreement of everyone who's contributed to Moses. Too much work
>>>>>>
>>>>>> Hieu Hoang
>>>>>> http://moses-smt.org/
>>>>>>
>>>>>>
>>>>>> On 10 April 2018 at 15:47, liling tan <[email protected]> wrote:
>>>>>>
>>>>>>> Dear Moses Dev,
>>>>>>>
>>>>>>> NLTK has a Python port of the word tokenizer in Moses. The tokenizer
>>>>>>> works well in Python and creates a good synergy, bridging Python users
>>>>>>> to the code that Moses developers have spent years honing.
>>>>>>>
>>>>>>> But it seems to have hit a wall with some licensing issues:
>>>>>>> https://github.com/nltk/nltk/issues/2000
>>>>>>>
>>>>>>> A general port of LGPL code is considered a derivative work and is
>>>>>>> incompatible with the Apache or MIT license. I understand that LGPL
>>>>>>> keeps derivatives from becoming proprietary, but it's a little less
>>>>>>> permissive than non-copyleft licenses like Apache and MIT.
>>>>>>>
>>>>>>> Note that this licensing issue might also affect Marian, which is
>>>>>>> MIT-licensed and also incompatible with LGPL. So although users can
>>>>>>> technically chain the code from different libraries, Marian couldn't
>>>>>>> have any dependencies on the Moses components. (But we do know that
>>>>>>> none of our models built with Marian would work without the Moses
>>>>>>> tokenizer, which is under LGPL.)
>>>>>>>
>>>>>>> Would there be a possibility to dual license the Moses repository
>>>>>>> with LGPL and an Apache/BSD/MIT license? I'm not sure whether it's
>>>>>>> allowed to have dual licenses with LGPL and Apache/BSD/MIT, though.
>>>>>>> We might have to check with proper legal personnel.
>>>>>>>
>>>>>>> If dual licensing is not possible, would it be possible to relicense
>>>>>>> the code under a BSD/Apache/MIT license? That way it's more permissive
>>>>>>> for derivative work.
>>>>>>>
>>>>>>> I think the last scenario is for NLTK to drop the Python port of
>>>>>>> Moses code entirely from the Apache-licensed repository, but I think
>>>>>>> that'll remove the synergy between the various OSS projects.
>>>>>>>
>>>>>>> Hope to hear from Moses devs soon!
>>>>>>>
>>>>>>> Regards,
>>>>>>> Liling
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Moses-support mailing list
>>>>>>> [email protected]
>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>
>


-- 
When a place gets crowded enough to require ID's, social collapse is not
far away.  It is time to go elsewhere.  The best thing about space travel
is that it made it possible to go elsewhere.
                -- R.A. Heinlein, "Time Enough For Love"