Re: [Apertium-stuff] Special Issue on Machine Translation for Low-Resource Languages (MT Journal)

2019-11-20 Thread Ilnar Salimzianov

Hey,

I don't remember whether I have said so already, but I'm in :)

Best,

Ilnar

On 20.11.2019 at 18:04, Jonathan Washington wrote:

Hi all,

This is just a reminder that the expression of interest for this
volume is due in less than a week!

The expression of interest is easy: just a title, list of authors, and
a short description.

If anyone else would like to help out with the updated Apertium paper
that we're planning to submit, then please get in touch.

--
Jonathan

On Fri, 1 Nov 2019 at 22:21, Jonathan Washington wrote:


Hi all,

Below please find a revised CFP for the Machine Translation Special
Issue on MT for Low-Resource Languages.

=
CALL FOR PAPERS: Machine Translation Journal
Special Issue on Machine Translation for Low-Resource Languages
https://www.springer.com/computer/ai/journal/10590/

GUEST EDITORS (Listed alphabetically)
• Alina Karakanta (FBK-Fondazione Bruno Kessler)
• Audrey N. Tong (NIST)
• Chao-Hong Liu (ADAPT Centre/Dublin City University)
• Ian Soboroff (NIST)
• Jonathan Washington (Swarthmore College)
• Oleg Aulov (NIST)
• Xiaobing Zhao (Minzu University of China)

Machine translation (MT) technologies have improved significantly
over the last two decades, with developments in phrase-based
statistical MT (SMT) and, more recently, neural MT (NMT). However,
most of these methods rely on large parallel corpora for training MT
systems, resources which are not available for the majority of
language pairs, and hence current technologies often cannot be
applied to low-resource languages. Developing MT technologies from
relatively small corpora remains a major challenge for the MT
community. In addition, many methods for developing MT systems rely
on several natural language processing (NLP) tools to pre-process
texts in the source language and post-process MT output in the
target language. The performance of these tools often has a great
impact on the quality of the resulting translation. The availability
of MT technologies and NLP tools can facilitate equal access to
information for the speakers of a language and determine on which
side of the digital divide they will end up. The lack of these
technologies for many of the world's languages presents
opportunities both for the field to grow and for tools to be made
available to speakers of low-resource languages.

In recent years, several workshops and evaluations have been
organized to promote research on low-resource languages. NIST has
conducted Low Resource Human Language Technology evaluations
(LoReHLT) annually from 2016 to 2019. In LoReHLT evaluations, there
is no training data in the evaluation language: participants receive
training data in related languages, but must bootstrap systems for
the surprise evaluation language at the start of the evaluation.
Methods for this include pivoting approaches and taking advantage of
linguistic universals. The evaluations are supported by DARPA's Low
Resource Languages for Emergent Incidents (LORELEI) program, which
seeks to advance technologies that are less dependent on large data
resources and that can be pivoted to new languages within a very
short amount of time, so that information from any language can be
extracted in a timely manner to provide situational awareness during
emergent incidents. There are also the Workshop on Technologies for
MT of Low-Resource Languages (LoResMT) and the Workshop on Deep
Learning Approaches for Low-Resource Natural Language Processing
(DeepLo), which provide venues for sharing research and development
in this field.

This special issue solicits original research papers on MT
systems/methods and related NLP tools for low-resource languages in
general. LoReHLT, LORELEI, LoResMT and DeepLo participants are very
welcome to submit their work to the special issue. Summary papers on
MT research for specific low-resource languages, as well as extended
versions (>40% difference) of published papers from relevant
conferences/workshops, are also welcome.

Topics of the special issue include but are not limited to:
 * Research and review papers of MT systems/methods for low-resource languages
 * Research and review papers of pre-processing and/or post-processing
NLP tools for MT
 * Word tokenizers/de-tokenizers for low-resource languages
 * Word/morpheme segmenters for low-resource languages
 * Use of morphological analyzers and/or morpheme segmenters in MT
 * Multilingual/cross-lingual NLP tools for MT
 * Review of available corpora of low-resource languages for MT
 * Pivot MT for low-resource languages
 * Zero-shot MT for low-resource languages
 * Fast building of MT systems for low-resource languages
 * Re-usability of existing MT systems and/or NLP tools for
low-resource languages
 * Machine translation for language preservation
 * Techniques that work across many languages and modalities
 * Techniques that are less dependent on large data resources
 * Use of 

Re: [Apertium-stuff] error

2019-11-20 Thread Daniel Swanson
Kiran, the issue is on line 19 of eng-hau.dix:

dog

Remove "dog" and it should work.

On Wed, Nov 20, 2019 at 12:23 PM Ngadou Yopa wrote:

> Hi Kiran
>
> Could you please paste the file in an online bin then send the link ?
>
> Ngadou
>
> On Wed, 20 Nov 2019 at 17:40, kiran srigiri wrote:
>
>> Trying to run make after adding words to the .dix file, but I hit this error


Re: [Apertium-stuff] Help with rule based translation

2019-11-20 Thread Hèctor Alòs i Font
Message from kiran srigiri on Wed, 20 Nov 2019 at 19:39:

> I have added the words "I", "love" and "you" to my .dix file [eng-hau];
> now I want to know how to write rules to translate these into Hausa. I want
> this eng-hau pair to get selected for GSoC 2020, so any help will
> be appreciated.
>
> The code is at: https://github.com/kiransrigiri/apertium-eng-hau
>

Hi Kiran,

I recommend you take a look at
http://wiki.apertium.org/wiki/Apertium_New_Language_Pair_HOWTO#Transfer_rules
(and probably at
http://wiki.apertium.org/wiki/Workflow_reference#Chunker.2FStructural_Transfer_Stage_1.2FIntra_chunk
too), as well as at another language pair, for instance Spanish-English. You
could try to find out how "I love you" is translated into Spanish.
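
As a very rough sketch of what such a rule looks like (this is not
taken from any existing pair; the category names prn and vblex would
need matching <def-cat> definitions in the .t1x file), a rule that
matches a pronoun-verb-pronoun pattern and outputs the three
target-language words in the same order would be roughly:

  <!-- match pronoun + verb + pronoun and output the target-language
       lexical units unchanged, keeping the blanks between them -->
  <rule comment="prn vblex prn">
    <pattern>
      <pattern-item n="prn"/>
      <pattern-item n="vblex"/>
      <pattern-item n="prn"/>
    </pattern>
    <action>
      <out>
        <lu><clip pos="1" side="tl" part="whole"/></lu>
        <b pos="1"/>
        <lu><clip pos="2" side="tl" part="whole"/></lu>
        <b pos="2"/>
        <lu><clip pos="3" side="tl" part="whole"/></lu>
      </out>
    </action>
  </rule>

Real rules then reorder the <lu> elements or rewrite individual tags
with <clip> as the target language requires.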

For beginners (and not only them), I strongly recommend using apertium-viewer
( http://wiki.apertium.org/wiki/Apertium-viewer ). It is a tool that helps
A LOT in understanding the transformations done at every step of the Apertium
pipe, including in the chunker.

Hèctor


Re: [Apertium-stuff] error

2019-11-20 Thread Ngadou Yopa
Hi Kiran

Could you please paste the file in an online bin then send the link ?

Ngadou

On Wed, 20 Nov 2019 at 17:40, kiran srigiri wrote:

> Trying to run make after adding words to the .dix file, but I hit this error


Re: [Apertium-stuff] Special Issue on Machine Translation for Low-Resource Languages (MT Journal)

2019-11-20 Thread Jonathan Washington
Hi all,

This is just a reminder that the expression of interest for this
volume is due in less than a week!

The expression of interest is easy: just a title, list of authors, and
a short description.

If anyone else would like to help out with the updated Apertium paper
that we're planning to submit, then please get in touch.

--
Jonathan

On Fri, 1 Nov 2019 at 22:21, Jonathan Washington wrote:
>
> Hi all,
>
> Below please find a revised CFP for the Machine Translation Special
> Issue on MT for Low-Resource Languages.
>
> =
> CALL FOR PAPERS: Machine Translation Journal
> Special Issue on Machine Translation for Low-Resource Languages
> https://www.springer.com/computer/ai/journal/10590/
>
> GUEST EDITORS (Listed alphabetically)
> • Alina Karakanta (FBK-Fondazione Bruno Kessler)
> • Audrey N. Tong (NIST)
> • Chao-Hong Liu (ADAPT Centre/Dublin City University)
> • Ian Soboroff (NIST)
> • Jonathan Washington (Swarthmore College)
> • Oleg Aulov (NIST)
> • Xiaobing Zhao (Minzu University of China)
>
> Machine translation (MT) technologies have improved significantly
> over the last two decades, with developments in phrase-based
> statistical MT (SMT) and, more recently, neural MT (NMT). However,
> most of these methods rely on large parallel corpora for training MT
> systems, resources which are not available for the majority of
> language pairs, and hence current technologies often cannot be
> applied to low-resource languages. Developing MT technologies from
> relatively small corpora remains a major challenge for the MT
> community. In addition, many methods for developing MT systems rely
> on several natural language processing (NLP) tools to pre-process
> texts in the source language and post-process MT output in the
> target language. The performance of these tools often has a great
> impact on the quality of the resulting translation. The availability
> of MT technologies and NLP tools can facilitate equal access to
> information for the speakers of a language and determine on which
> side of the digital divide they will end up. The lack of these
> technologies for many of the world's languages presents
> opportunities both for the field to grow and for tools to be made
> available to speakers of low-resource languages.
>
> In recent years, several workshops and evaluations have been
> organized to promote research on low-resource languages. NIST has
> conducted Low Resource Human Language Technology evaluations
> (LoReHLT) annually from 2016 to 2019. In LoReHLT evaluations, there
> is no training data in the evaluation language: participants receive
> training data in related languages, but must bootstrap systems for
> the surprise evaluation language at the start of the evaluation.
> Methods for this include pivoting approaches and taking advantage of
> linguistic universals. The evaluations are supported by DARPA's Low
> Resource Languages for Emergent Incidents (LORELEI) program, which
> seeks to advance technologies that are less dependent on large data
> resources and that can be pivoted to new languages within a very
> short amount of time, so that information from any language can be
> extracted in a timely manner to provide situational awareness during
> emergent incidents. There are also the Workshop on Technologies for
> MT of Low-Resource Languages (LoResMT) and the Workshop on Deep
> Learning Approaches for Low-Resource Natural Language Processing
> (DeepLo), which provide venues for sharing research and development
> in this field.
>
> This special issue solicits original research papers on MT
> systems/methods and related NLP tools for low-resource languages in
> general. LoReHLT, LORELEI, LoResMT and DeepLo participants are very
> welcome to submit their work to the special issue. Summary papers on
> MT research for specific low-resource languages, as well as extended
> versions (>40% difference) of published papers from relevant
> conferences/workshops, are also welcome.
>
> Topics of the special issue include but are not limited to:
>  * Research and review papers of MT systems/methods for low-resource languages
>  * Research and review papers of pre-processing and/or post-processing
> NLP tools for MT
>  * Word tokenizers/de-tokenizers for low-resource languages
>  * Word/morpheme segmenters for low-resource languages
>  * Use of morphological analyzers and/or morpheme segmenters in MT
>  * Multilingual/cross-lingual NLP tools for MT
>  * Review of available corpora of low-resource languages for MT
>  * Pivot MT for low-resource languages
>  * Zero-shot MT for low-resource languages
>  * Fast building of MT systems for low-resource languages
>  * Re-usability of existing MT systems and/or NLP tools for
> low-resource languages
>  * Machine translation for language preservation
>  * Techniques that work across many languages and modalities
>  * Techniques that are less dependent on large data resources

[Apertium-stuff] error

2019-11-20 Thread kiran srigiri
Trying to run make after adding words to the .dix file, but I hit this error


[Apertium-stuff] Help with rule based translation

2019-11-20 Thread kiran srigiri
I have added the words "I", "love" and "you" to my .dix file [eng-hau];
now I want to know how to write rules to translate these into Hausa. I want
this eng-hau pair to get selected for GSoC 2020, so any help will
be appreciated.

The code is at: https://github.com/kiransrigiri/apertium-eng-hau
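
For reference, entries of this kind in the bilingual dictionary
(typically apertium-eng-hau.eng-hau.dix) look something like the
sketch below; the Hausa sides and the tags are placeholders, not the
actual contents of the repository:

  <section id="main" type="standard">
    <!-- left side = English, right side = Hausa; the lemma and tags
         on each side must match the corresponding monolingual
         dictionary -->
    <e><p><l>love<s n="vblex"/></l><r>so<s n="vblex"/></r></p></e>
    <e><p><l>you<s n="prn"/></l><r>kai<s n="prn"/></r></p></e>
  </section>

If a lemma or its tags do not match what the monolingual dictionaries
produce, the word typically shows up with an "@" mark when you test
the translator.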