Re: [Apertium-stuff] Extend weighted transfer rules GSoC proposal

2019-04-19 Thread Aboelhamd Aly
According to the timeline I put in my proposal, I am supposed to start
phase 1 today.
I want to know what procedure I should follow to document my work, day by day
and week by week.
Should I create a page on the wiki to record my progress?
Or is there another way?

Thanks

Re: [Apertium-stuff] Extend weighted transfer rules GSoC proposal

2019-04-19 Thread Aboelhamd Aly
Hi Sevilay. Hi Francis,

Unfortunately, Sevilay reported that the evaluation results for the kaz-tur and
spa-eng pairs were very bad, with only 30% of the tested sentences being good
compared to Apertium's LRLM resolution.
So we discussed what to do next, and decided to make use of the breakthrough of
deep learning (neural networks) in NLP, and especially in machine translation.
We also discussed using values of n greater than 5 in the n-gram language model
we already use, and evaluating the effect of increasing n, which could give us
some more insight into what to do next and how to do it.
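
As a rough illustration of that evaluation step, here is a minimal sketch of
scoring translation candidates with n-gram models of different orders. It
assumes KenLM ARPA models and the kenlm Python bindings, which may differ from
the toolkit actually used here, and the file names and candidates are
hypothetical:

    import kenlm

    # Hypothetical ARPA models trained with different orders (n = 5 and n = 8).
    lm5 = kenlm.Model("spa-eng.5gram.arpa")
    lm8 = kenlm.Model("spa-eng.8gram.arpa")

    candidates = [
        "example translation produced by rule combination one",
        "example translation produced by rule combination two",
    ]

    # kenlm returns log10 probabilities; higher (less negative) is better.
    for sent in candidates:
        print(lm5.score(sent, bos=True, eos=True),
              lm8.score(sent, bos=True, eos=True),
              sent)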

Since I am taking an introduction to deep learning course this term in college,
I spent these past two weeks being introduced to the applications of deep
learning in NLP and machine translation.
Now I have a basic knowledge of recurrent neural networks (RNNs) and of why
they are used instead of standard feed-forward networks in NLP, besides
understanding their different architectures and the math behind forward and
back propagation.
I also know how to build a simple language model, and how to avoid the
vanishing-gradient problem, which prevents capturing long dependencies, by
using Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) networks.

As a next step, we will consider working only on the language model and leave
the max entropy part for later discussions.
So along with trying different n values in the n-gram language model and
evaluating the results, I will try either to use a ready-made RNNLM or to build
a new one from scratch from what I have learnt so far. Honestly, I prefer the
latter choice because it will increase my experience in applying what I have
learnt.
In the last two weeks I implemented RNNs with GRUs and LSTM, and also a
character-based language model, as two assignments, and they were very fun to
do. So implementing an RNN word-based LM will not take much time, though it may
not be close to a state-of-the-art model, which is its disadvantage.
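
For concreteness, here is a minimal sketch of the kind of word-based RNN
language model meant here, written with PyTorch purely as an assumption; the
real vocabulary handling and training loop would be worked out during the
project:

    import torch
    import torch.nn as nn

    class RNNLM(nn.Module):
        """Word-level LSTM language model: predicts the next word at each step."""
        def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, word_ids):             # word_ids: (batch, seq_len)
            hidden, _ = self.lstm(self.embed(word_ids))
            return self.out(hidden)              # logits: (batch, seq_len, vocab)

    def sentence_logprob(model, word_ids):
        """Sum of log P(w_t | w_<t); plays the same role as an n-gram LM score."""
        log_probs = torch.log_softmax(model(word_ids[:, :-1]), dim=-1)
        targets = word_ids[:, 1:].unsqueeze(-1)
        return log_probs.gather(-1, targets).sum().item()

A model like this would rank the translation candidates the same way the n-gram
scores do, just with log probabilities coming from the network instead.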

Using an NNLM instead of the n-gram LM has these possible advantages:
- It learns syntactic and semantic features automatically.
- It overcomes the curse of dimensionality by generalizing better.

--

I tried using n=8 instead of 5 in the n-gram LM, but the scores weren't that
different, as Sevilay pointed out in our discussion.
I knew that an NNLM is better than a statistical one, and also that using
machine learning instead of a maximum entropy model would give better
performance.
*But* the evaluation results were very disappointing, unexpected and illogical,
so I thought there might be a bug in the code.
After some searching, I found that I had made a very silly *mistake* in
normalizing the LM scores. As the scores are the log base 10 of the sentence
probability, a score that is larger in magnitude corresponds to a lower
probability, but what I did was the inverse of that, and that was the cause of
the very bad results.
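
To make the fix concrete, here is a small sketch of how candidate selection
should treat those log10 scores; the names are hypothetical and the per-word
normalization shown is only one plausible choice:

    # Hypothetical (candidate sentence, log10 probability) pairs from the LM.
    scored = [
        ("translation from rule combination one", -42.7),
        ("translation from rule combination two", -35.1),
    ]

    def normalized(log10_prob, sentence):
        # Per-word average keeps long and short candidates comparable.
        return log10_prob / max(1, len(sentence.split()))

    # Log10 probabilities are negative: a larger magnitude means a LOWER
    # probability, so the best candidate is the one with the HIGHEST score,
    # not the one with the largest absolute value.
    best_sentence, best_score = max(scored, key=lambda p: normalized(p[1], p[0]))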

I am fixing this now, and then I will re-evaluate the results with Sevilay.

Regards,
Aboelhamd


On Sun, Apr 7, 2019 at 6:46 PM Aboelhamd Aly wrote:

> Thanks Sevilay for your feedback, and thanks for the resources.
>
> On Sun, 7 Apr 2019, 18:42 Sevilay Bayatlı wrote:
>> Hi Aboelhamd,
>>
>> Your proposal looks good. I found these resources that may be of benefit.
>>
>> Multi-source *neural translation*
>> https://arxiv.org/abs/1601.00710
>>
>> *Neural machine translation* with extended context
>> https://arxiv.org/abs/1708.05943
>>
>> Handling homographs in *neural machine translation*
>> https://arxiv.org/abs/1708.06510
>>
>>
>>
>> Sevilay
>>
>> On Sun, Apr 7, 2019 at 7:14 PM Aboelhamd Aly <
>> aboelhamd.abotr...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have an idea, not yet solid, as an alternative to yasmet and the max
>>> entropy models: using neural networks to give us scores for the ambiguous
>>> rules.
>>> But I haven't yet settled on a formulation of the problem, nor on the
>>> structure of the inputs, the output, or even the goal,
>>> as I think there are many formulations that we could adopt.
>>>
>>> For example, the most straightforward structure is to give the network
>>> all the possible combinations of a sentence's translations and let it
>>> choose the best one, or assign them weights.
>>> Hence, the network learns which combinations to choose for a
>>> specific pair.
>>>
>>> Another example: instead of building one network per pair,
>>> we build one network per ambiguous pattern, as we did with the max entropy
>>> models.
>>> So we give the network the combinations for that pattern,
>>> and let it assign weights to the ambiguous rules applied to that
>>> pattern (a small sketch of this idea appears below).
>>>
>>> And for each structure there are many details and questions yet to be
>>> answered.
>>>
>>> So with that said, I decided to look at some papers to see 
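
Below is a minimal sketch of the per-pattern idea from the quoted message
above: a small network that takes features of an ambiguous pattern's context
and outputs weights over the rules that match that pattern. The shapes, feature
extraction and names are assumptions, not the project's settled design:

    import torch
    import torch.nn as nn

    class PatternRuleScorer(nn.Module):
        """One network per ambiguous pattern: weights the rules that can apply."""
        def __init__(self, feature_dim, num_rules, hidden_dim=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(feature_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, num_rules),
            )

        def forward(self, context_features):      # (batch, feature_dim)
            # Softmax turns raw scores into weights over the ambiguous rules.
            return torch.softmax(self.net(context_features), dim=-1)

    # Hypothetical usage: 3 rules compete for one pattern, 20 context features.
    scorer = PatternRuleScorer(feature_dim=20, num_rules=3)
    weights = scorer(torch.randn(1, 20))    # e.g. tensor([[0.2, 0.5, 0.3]])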

Re: [Apertium-stuff] Converting wiki to README

2019-04-19 Thread Amr Mohamed Hosny Anwar
Hi all,

I noticed that some of the repositories on GitHub have their README files
written in Markdown, while others have them written in another format (similar
to the wiki's format, but I don't recognize it).

Wouldn't it be better if we wrote a script that converts them all to Markdown
and creates a pull request in each repo, so that the admin of each repo can
review the converted Markdown README and then accept the request?
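
If the non-Markdown READMEs are in MediaWiki syntax, which is an assumption, a
script along these lines could do the bulk of the conversion with pandoc;
opening the pull requests would still be a separate step, for example via the
GitHub API:

    import subprocess
    from pathlib import Path

    # Hypothetical layout: one checked-out Apertium repository per sub-directory.
    for readme in Path("repos").glob("*/README"):
        target = readme.with_name("README.md")
        # pandoc reads MediaWiki markup and writes GitHub-flavoured Markdown.
        subprocess.run(
            ["pandoc", "-f", "mediawiki", "-t", "gfm", str(readme), "-o", str(target)],
            check=True,
        )
        print(f"converted {readme} -> {target}")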

Amr



Re: [Apertium-stuff] Converting wiki to README

2019-04-19 Thread Sandy
Exactly what I needed. Thank you very much :)



Re: [Apertium-stuff] Converting wiki to README

2019-04-19 Thread Ilnar Salimzianov
Hey Sandy,

This might be what you're looking for:

https://stackoverflow.com/questions/9824489/any-markdown-to-wikimarkup-converter-available

Best,

selimcan


-- 
GPG: 0xF3ED6A19





[Apertium-stuff] Converting wiki to README

2019-04-19 Thread Sandy
Is there an existing script or tool to convert a wiki page into a README? I
tried looking for one but wasn't able to find any. Or is it necessary to do it
manually?

