Hi,

Firstly, you should write to the CG mailing list instead of CC'ing us all - 
https://groups.google.com/forum/#!forum/constraint-grammar / 
[email protected]<mailto:[email protected]> 
- I have done so with this reply.

Sorry for my ignorance. I didn’t know about this mailing list, as I am not 
attached to this community. Anssi, are you on that list? This seems to be a 
good way of discussing issues related to the project we are planning. It’s 
probably good if you join in case you are not part of it yet. I will probably 
try to avoid joining yet another mailing list, but you could keep me in the cc 
on relevant messages. Thanks!


Anyway, I have been saying this for a long time. The past decade of machine 
learning has simply approached the hand-written methods. E.g., when Google 
published 
https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html 
my immediate comment was: Their induced rules smell an awful lot like 
constraint grammar, just expressed in vector fields.

There is nothing controversial about it. Humans learn languages from data and 
experience and then write down their rules in terms of grammars and lexica. 
Machines do the same and optimise parameters of their models (no matter whether 
they are rules, lexica or vector-based representations). However, humans are 
not good at expressing uncertainty and quantifying subjective interpretation. 
That’s why I would not go so far as to claim that ML models approach 
hand-written methods, but hopefully they are able to capture the regularities 
that humans are capable of discovering. That’s exactly what we would like to 
study: in what way is machine learning capable of discovering the regularities 
that humans identify (e.g. constraints in CG)? We do not expect an exact match, 
especially because CG grammars and neural LMs are developed/trained for 
different tasks, but we hope to see some correlations. So, any help and 
suggestions on how to study this question would be very much appreciated.
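To make the question a bit more concrete, here is a minimal, purely 
illustrative sketch (in Python, with simulated data) of one way such a 
correlation study could look: treat each CG rule application as a binary 
per-token decision, and rank individual neurons of a neural LM by how strongly 
their activations correlate with those decisions. Everything below is a 
hypothetical placeholder; real inputs would come from a CG parser trace and 
from activations extracted from a trained model.

```python
# Hypothetical sketch: correlate per-token CG disambiguation decisions
# with individual neuron activations. All data here is SIMULATED.
import random
import math

random.seed(0)
n_tokens = 200
n_neurons = 8

# Simulated CG decisions: 1 if some rule fired on the token, else 0.
cg_fired = [random.randint(0, 1) for _ in range(n_tokens)]

# Simulated activations; neuron 0 is constructed to loosely track the
# CG decisions so that the sketch finds a non-trivial correlation.
activations = [
    [cg_fired[t] * 0.8 + random.gauss(0.0, 0.5) if n == 0
     else random.gauss(0.0, 1.0)
     for t in range(n_tokens)]
    for n in range(n_neurons)
]

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy)

# Rank neurons by the strength of their correlation with CG decisions.
scores = [(abs(pearson(activations[n], cg_fired)), n)
          for n in range(n_neurons)]
best_score, best_neuron = max(scores)
print(f"best neuron: {best_neuron}, |r| = {best_score:.2f}")
```

Of course, a real study would replace the binary rule-fired signal with 
richer traces (which rule, which reading was removed) and the toy neurons 
with layer-wise transformer activations, but the overall shape of the 
analysis would be similar.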


Other papers in the field even explicitly say that their models look a lot more 
like classic systems, with separate source language analysis, transfer, and 
target language generation. And this holds for any text-to-text transformation, 
not just translation. So I am not the least bit surprised that more advanced 
models look more and more like rule-based systems.

True, encoder-decoder models use exactly that kind of architecture, and the 
success of pre-trained neural language models shows the importance of generic 
language learning before applying a model to other downstream tasks. This is 
nothing new; even n-gram LMs were generic tools that came in handy for all 
kinds of downstream applications. However, even though the overall architecture 
might look the same, I would not claim that such models look more and more like 
rule-based systems. Statistical machine translation is a rule-based approach 
with large probabilistic phrase-based or even tree-based translation rules. 
Neural MT and other neural models contain no explicit rules anymore. So, the 
shift to non-rule-based systems only happened with the deep learning wave, not 
before.

In any case, I really welcome further discussions about conceptual and internal 
similarities / differences as we really need this to understand what is going 
on in the models we develop no matter whether they are hand-written, rule based 
or neural. Thanks in advance for any feedback and comments in that direction.

All the best,
Jörg



-- Tino Didriksen


On Wed, 26 Feb 2020 at 14:26, Tiedemann, Jörg 
<[email protected]<mailto:[email protected]>> wrote:
Dear CG community,


I am reaching out to you because we would like to follow up on Anssi 
Yli-Jyrä’s ideas on comparing CG to transformer models, to see whether there 
are commonalities between expert-made linguistic grammars and learned neural 
language models. This is a fascinating question, and we would like to carry 
out some empirical studies to find possible correlations and patterns.

It would be great to get an update about available CG resources to get started, 
and it would also be interesting to hear whether any of you would be interested 
in collaborating in that study. What I had in mind was to look into the 
disambiguation process performed on real-world data using CG-based parsers and 
to compare that with the activations triggered in trained neural language 
models.

It would be excellent to know whether there are some (hopefully freely 
available) wide-coverage grammars and parsers that we can study. Most likely, 
we need to look into high-resource languages (including Finnish) to also make 
proper comparisons to neural models, but other scenarios are possible as well. 
Please let me and Anssi know whether you have any suggestions. Thanks a lot!


All the best,
Jörg

-- 
You received this message because you are subscribed to the Google Groups 
"Constraint Grammar" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/constraint-grammar/E715C266-CC5E-4CF0-AE57-4D812BBD62EC%40helsinki.fi.
