I agree. This intense discussion is not new in 2023: we had it some years ago, and I found it something of a dialogue of the deaf then. Time to move that particular discussion permanently elsewhere, please!

Thanks in advance


Mike


On 30/08/2023 17:14, Sina Ahmadi via Corpora wrote:
I second Edyta's points too.

I have been on this list since 2015, and since then the mailing list's standout feature has lain in its informative capacity to circulate calls for papers and job opportunities. While occasional "discussions" have also been a breath of fresh air, the current discourse doesn't quite align with this sentiment.

It would be more beneficial if the list could enhance its utility by keeping intense discussions private rather than disseminating them widely.

Thanks.

Best regards,
Sina Ahmadi
Postdoctoral Researcher & Adjunct Lecturer
George Mason University
http://sinaahmadi.github.io/
*On the job market!* I'm seeking out new opportunities to collaborate and innovate as a researcher and lecturer (in Europe).
------------------------------------------------------------------------
*From:* Daniela Cesiri via Corpora <[email protected]>
*Sent:* Wednesday, 30 August 2023 11:32
*To:* Edyta Jurkiewicz-Rohrbacher <[email protected]>
*Cc:* [email protected] <[email protected]>
*Subject:* [Corpora-List] Re: RANLP 2023 Call for Participation
Dear All,

I agree with Edyta's polite remarks.

I find the discussions below purely informative posts quite confusing, and I am "losing track" of the original posts to the point that I fear I might miss calls that could be relevant for my work, or miss discussions that are worth joining. Before Edyta's remarks I was even considering leaving the list because of the current situation on the list.

So, I join Edyta's kind request to keep discussions as separate threads and leave calls for papers/abstracts or job calls as purely informative posts. Perhaps opening a new, separate discussion thread might be an alternative option that would allow us to filter the different kinds of communications we receive from the list.

Best wishes to everyone,
Daniela Cesiri

On Wed, 30 Aug 2023, 17:15 Edyta Jurkiewicz-Rohrbacher via Corpora <[email protected]> wrote:

    Dear Ada, dear all,
    I'm a bit concerned about what has been going on with the list
    recently. The list, as far as I understand, serves several purposes.
    One of them is purely informative, where one informs the community
    about potentially interesting jobs, conferences etc. If I open an
    answer to a job advertisement, I expect it to be a question useful
    for potential applicants or a correction about, for example,
    deadlines.

    Another purpose is to ask questions or start discussions on
    various topics, either theoretical or purely practical. There I
    would expect people to share their experience and opinions.

    What I do not find OK is giving feedback on purely informational
    posts in the way Ada does. In my opinion, discussions of whether
    words or sentences are up-to-date concepts in any (general)
    linguistic or computational linguistic framework should be held in
    separate threads. (Notice also that the problem of text
    segmentation has been a topic for a long time already.) Summing up,
    I wouldn't mind if Ada's comments were presented privately to the
    authors of the posts, or discussed in separate list mails.
    Otherwise, we are facing chaos here.

    In short, I would be more than happy to participate if discussions
    about the relation between linguistics and NLP took place, as long
    as they were not mixed with advertisements.

    I hope I did not offend anybody with this message.
    Best,
    Edyta Jurkiewicz-Rohrbacher

    On Wed, 30 Aug 2023 at 16:35, Gilles Sérasset via Corpora
    <[email protected]> wrote:
    >
    > Dear Ada, dear all,
    >
    > I am not a linguist but a computational scientist who is quite
    > used to talking with (and trying to understand) linguists. I must
    > say that I usually read your mails as thoroughly as my schedule and
    > patience allow, but, to be honest, I also have a rather negative
    > feeling when reading your “discourse”.
    >
    > In this discourse, I see facts + interpretation + rhetoric.
    >
    > [Here I take the risk of caricaturing for the sake of brevity; I
    > hope you will understand that I have neither the time nor the
    > intention to go deeply into all the intricacies of your different
    > claims, as I am more a witness to than an actor in this scientific
    > dispute.]
    >
    > My understanding of your facts: Neural models do not use the
    > concept of word in any of their tasks, but achieve very
    > interesting results in their modelling of the language.
    >
    > My understanding of your interpretation: this is the proof that
    > there is no such thing as a word.
    >
    > My understanding of your rhetoric: linguists are still using
    > “words”, so they are wrong or dishonest or miseducated or dumb; we
    > should wipe out entirely any occurrence of this concept and start
    > over with another modelling of the language.
    >
    > Please understand that I am just presenting the way I interpret
    > your different messages. Even if I am wrong here, this
    > interpretation is to be taken into account, as we are all people
    > with feelings. This feeling is a fact, even if I do not
    > particularly feel targeted by your different criticisms. I hope
    > this will help you ponder the terms used in your next messages.
    >
    > This being said, I was not particularly surprised to see some
    > “passionate” replies to your different messages. I agree with
    > everyone here that we should not give in to such passion or use
    > ad hominem attacks on a mailing list, AND you should also
    > understand that much of your rhetoric does contain such passion
    > and attacks.
    >
    >
    >
    >
    > Concerning the facts:
    >
    > You are right: neural models do not use any notion of word (or
    > word morphology) as it is usually thought of in linguistics, since
    > they usually first decide the granularity at which they will
    > aggregate their input (a sequence of characters) into tokens, to
    > each of which they attach an “interpretation” (modelled as a
    > multi-dimensional vector).
    >
    >
    >
    >
    > Concerning the interpretation:
    >
    > 1. You want to wipe out the notion of word based on such a fact.
    > I would agree somehow if we were dealing with a universal
    > modelling of language, but this is not the case. Humans model
    > language in a certain way and neural models in another way (even
    > if neural networks are claimed to be inspired by the biological
    > neurones in our brains). The fact that a concept does not exist in
    > one model does not entail that it does not exist in another model.
    >
    >
    > 2. Also, you make the very same mistake in the way you look at
    > the facts: there is no such thing as a character either, which
    > means that the input of NNs is already flawed by the bias with
    > which we look at language. Indeed, characters are a very recent
    > invention that builds on different concerns:
    >  - the usual graphical elements that are traditionally used in
    >    writing a language and that have been interpreted as atomic,
    >  - their interpretation by the encoding authorities (see the
    >    differences and debates about code points vs characters),
    >  - arbitrary decisions (e.g. why model A and a as two different
    >    characters?).
    > Moreover, corpora are usually badly encoded, with one character
    > used for another (a quote instead of an apostrophe, a non-breaking
    > space instead of a space, …), and this only accounts for
    > languages with a writing system or transcription, i.e. not the
    > majority of them.
    >
    > The conclusion is that even neural networks use artificial biases
    > in the way they model language, which means that the conclusions
    > we draw from them are as flawed as the ones we draw from the
    > classical way linguists look at languages.
    >
    >
    > 3. Most serious linguists never defined “words” lightly, and most
    > of them know that this concept is an “approximation” of something
    > that is very difficult to apprehend and seems to be grounded more
    > in linguistics from human introspection than in linguistics from
    > corpora. It somehow represents the way our human brain aggregates
    > the atoms of the language (characters/phonemes) into something
    > with which we associate an interpretation. In this sense, words
    > are somehow the “tokens” of our biological neural network (and
    > certainly far more).
    >
    > As an utterance production is not a bijection between whatever
    > we have in our head and the sequential signal we use to
    > communicate, I agree with you on the fact that “words” are
    > certainly not present in a corpus (but I do think that our inner
    > “tokens” may be observed somehow there).
    >
    >
    > Concerning the rhetoric:
    >
    > I do not think any linguist or computational linguist is naive
    > enough to think that any of the models we deal with is a “truth”,
    > and I doubt any of them is miseducated enough to think that
    > “words” are clearly defined and undoubtedly present in corpora. I
    > do think, though, that they are usually right to observe
    > occurrences (or hints) of non-atomic constructs that we associate
    > with some interpretation. I also think that this way of looking at
    > a corpus has some advantages that are not really present in NNs
    > (for instance, it can reveal regularities that help humans
    > produce new utterances without being shown a large number of
    > examples).
    >
    > I also think that even if you were totally right in your facts
    > and interpretations, asking for a denial of current/past ways of
    > looking at texts would be a mistake. Even in physics, since the
    > general theory of relativity, we know that classical mechanics is
    > wrong; however, it is still in use, and this is not a problem as
    > long as everybody knows under which hypotheses it is a good enough
    > approximation and under which hypotheses it no longer works.
    >
    >
    >
    > I know this message will certainly not make you think
    > differently, but if it allows you to communicate differently with
    > people who still use the terms “words” or “sentences” as a simple
    > shortcut to position their work within a shared/common
    > understanding of the state of the art, in contexts where there is
    > no room for better explanation (e.g. in summaries of their keynote
    > speeches), then I will have achieved something.
    >
    > Hoping this scientific debate will continue in a calmer manner,
    >
    > Regards,
    >
    > Gilles Sérasset,
    >
    > _______________________________________________
    > Corpora mailing list -- [email protected]
    > https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
    > To unsubscribe send an email to [email protected]




--
Mike Scott
lexically.net
Lexical Analysis Software and Aston University
