Linguistics is not "supposed to be about spoken languages" --- that has
just been a disciplinary bias. And the term "spoken languages" is also not
quite correct. There are scripts, there are sounds, there are signs --- all
of which are *ways in which we express ourselves and interpret
information*. Such "ways" can be regarded as language (in the sense
traditionally and commonly understood), i.e. g-language*. P-language can be thought
of as an instance of such (mixed with some historical contexts, but also A
LOT OF identity politics and language attitudes).

*g-language as general/generalized language or "language-at-large" (or in
Haspelmath's formulation "general phenomenon of Human Language"),
p-language as "particular language(s)" such as those commonly referred to
e.g. FR, EN, etc.
See also Haspelmath (2019). Confusing p-linguistics and g-linguistics:
Philosopher Ludlow on “framework-free theory”.
https://dlc.hypotheses.org/1801. Note though that I, by "decomposing
'words'", am taking things one step further in generalization compared to
what he would refer to as g-language phenomena. This is because, prior to
my work, language had been considered to be grammar and grammar to be
language, at least in the academic setting, and even in Linguistics as an
academic discipline.

Much of what I don't agree with in our current education and R&D landscape
has to do with the abuse of "p-language(s)". It's research greed coupled
with sentiment manipulation. Many who have studied linguistics (and many
philological studies) as well as computational linguistics (and possibly
natural language processing --- depending on which
traditions/assumptions/practices is/are being taught) are likely to have
been affected (I myself included). There was a stage in my life when I also
wanted to speak up / advocate for minority languages. Then I realized that
it's not really about the language (as texts or speech/sign data...).

[And yes, of course, there ARE the parts about the texts and speech/sign
data --- the documentation, the processing, the interpretation/description.
But it's not really what many/most people in CL/NLP or the "language space"
are doing. They are not seeing interpretation as description. They are
describing subjectively (mixing in some historical bias), then processing
with biases historical and personal (and emotional), and IF they are
interpreting at all, they are doing so from a grammarian's or "layperson's"
perspective (i.e. not a scientific/technical one).

And also, of course, there is much truth-telling to do when it comes to
language attitudes and identity politics, for these can affect the
availability of data and how data "quality" is perceived, among other
things. But these do not have anything to do with how data is to be
computationally processed. For those who perform testing on humans, their
studies could reveal potential human biases effected by language attitudes
etc.]

=====

2. Re "John loves Mary" etc.:
I think you are mixing up your expectation of how a canonical form
should behave with its relevance.
Re "I am unable to imagine a scenario where "John loves Mary" could be the
same "Mary loves John"": they are both 15 characters long.
(As to how they can be different, one thing that has not been mentioned: we
can interpret/describe these strings with e.g. their transition
probabilities, from character to character or from one character n-gram to
another --- there are a few who have worked on this, e.g. John Goldsmith
(see his YouTube videos on MDL) and myself (though in a "word"-free
manner). It could give one some insight into the interpretation of
phonological phenomena.)
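
To make the point about transition probabilities concrete, here is a minimal sketch in Python. It is an illustration only, under my own simplifying assumptions: a plain maximum-likelihood estimate over adjacent character pairs within a single string, with a hypothetical helper name `transition_probs`. It is not Goldsmith's MDL method, nor the procedure used in my own work. It shows that the two strings share length and character inventory yet differ in some of their character-to-character transitions.

```python
from collections import Counter, defaultdict


def transition_probs(text):
    """Estimate P(next char | current char) from adjacent character pairs.

    Assumption: plain relative frequencies within this one string,
    with no smoothing and no boundary symbols.
    """
    pair_counts = Counter(zip(text, text[1:]))   # counts of (current, next)
    current_counts = Counter(text[:-1])          # how often each char has a successor
    probs = defaultdict(dict)
    for (current, nxt), n in pair_counts.items():
        probs[current][nxt] = n / current_counts[current]
    return dict(probs)


s1 = "John loves Mary"
s2 = "Mary loves John"

p1 = transition_probs(s1)
p2 = transition_probs(s2)

# Same length and same multiset of characters ...
same_length = len(s1) == len(s2)
same_characters = Counter(s1) == Counter(s2)

# ... but the outgoing transitions of some characters differ.
differing = sorted(c for c in set(p1) | set(p2) if p1.get(c) != p2.get(c))

print(same_length, same_characters)
print(differing)
```

The same idea extends to character n-grams by sliding a window of length n over the string instead of taking single characters.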

3. "Word" is not ill-defined. It is not definable, in ways that would be
necessary or sufficient for science, engineering, R&D.

4. Re "phonetics and phonology":
No. Though having had p and p may help with understanding language in finer
granularities, it's not so much about p and p per se (or any academic
degree or courses). It's about having a scientific mindset. Among
phoneticians, phonologists, morphologists, syntacticians,
semanticists/semanticians, pragmaticists and social linguists, phoneticians
(and some social linguists) were the ones who started using machines to
study language. Seeing language as an object of scientific
inquiry/investigation is key.

I am not writing what I wrote because I have had some "advanced
Linguistics"; I am not trying to impose any academic superiority here. But
from what you wrote, I do see/sense that a clearer demarcation between
science and language attitudes & identity politics is possible.

Re "I don't think you do at all, based on your comments":
I do understand and empathize.
I might have just seen/experienced things from a different angle --- I have
seen how language attitudes / identity politics have been exercised in the
academic context (also industry), so as to "broaden the practitioner pool" for
language-related technologies. It's almost like selling/promoting a less
legitimate form of currency. In that respect, I think there are things to
correct. (So I might come off a bit stern, but that should not be
misinterpreted as my lacking empathy/compassion.)

Re "no one has to adopt a purist attitude when it comes to
using/understanding of language by others.
Perfectly true. Did I even hint at that to the least degree? You are again
shadow boxing.":
I am agreeing with you by echoing, (re-)confirming/reformulating.

Re "But even when I started, I was not a purist in any sense of the word.":
Then how/where do you draw the line between p-languages, esp. outside the
context of computing? Some puristic ideas/ideals are there to back the
assumption/adoption of such naming.

Re "Why do you make assumptions as you comment on anything and hurl
semi-insults? You don't really know me. I didn't assume anything about
you.":
I think there might have been some misunderstandings. I wasn't engaging
with you out of prejudice. In fact, I have been trying to understand you
better. Understanding readers' perspectives is important for authors. As
you also identified yourself as a seasoned CLer, I thought to draw the
connection between your views and some of what I've observed from "the CL
population" (e.g. as exemplified by some dominant (and implicit)
assumptions/interpretations about the state of our art/discipline/area (or
whether there is one)).

I did and do know about the language situation in India (e.g. linguistic
hegemony) --- I may not have witnessed things in the ways you have, or be
able to describe things in as much detail or report on the group sentiment
(be it perceived, collectively acted upon, or that which ends up shaping
the social infrastructure). But there are many parallels in
language phenomena. The crux of the matter lies not in language --- that's
a message I've been trying to get across. (It took me a while to come to
terms with that.)


On Mon, Aug 14, 2023 at 9:16 AM Anil Singh <[email protected]> wrote:

> On Fri, Aug 11, 2023 at 9:52 PM Ada Wan <[email protected]> wrote:
>
>> Just a quick reply before the weekend to some of the points that I
>> thought deserve a short clarification:
>>
>> 1. re linguistic empowerment: yes and no. As I commented on X (formerly
>> Twitter) on 09Aug2023: "[t]here are divergent ways of thinking... but when
>> it comes to language and the social sciences, one must be careful with how
>> one "diverges"! Humans and [sic: are] humans. And much of what we postulate
>> re "in-group/out-group" can be a matter of our "traditions" (if so, is it
>> time to re-evaluate?), perspectives (if so, can we be biased sometimes?),
>> or our willingness to include or will to exclude. How different can
>> particular "languages" (in a folk psychological, proverbial usage) be,
>> really? Where do the differences lie?".
>>
>
> Of course. That goes without saying. For almost 40 years now I have been
> looking at this issue and thinking about it. I had wanted to write a book
> about it. This is what got me into languages and then NLP, because I was
> then hooked by the study of language by itself, even without the issue of
> linguistic empowerment, particularly from the computational point of view.
> I have looked at it from all possible points of view. I was not a
> linguistic purist even when I began -- with a lot of bitterness -- and I am
> certainly not that now. I have no doubt at all that there is something
> universal and species-specific about human languages, although I don't know
> in what way it is universal exactly. No one does, as far as I know. No one
> could be more against linguistic chauvinism of any kind than me. Or any
> other kind of chauvinism.
>
>
>> The same goes with "diversity" efforts. AND OF COURSE I am NOT AGAINST
>> these.
>>
>
> I believe that. I didn't say you were. I just gave an example to make my
> point.
>
>
>> But one has to be careful how far one goes with "difference(s)".
>>
>
> Only as far as is reasonable and fair to everyone.
>
>
>> 2. Re "John loves Mary" being "same/different" as "Mary loves John": it
>> depends. Note stress/emphasis/topicalization, different usage pattern(s)
>> etc., not just "subj verb obj".
>>
>
> Well, yes, that is the central contradiction of Linguistics. It is
> primarily supposed to be about spoken language, but -- quite naturally --
> linguists in academic literature have to use examples in written form. And
> the written form misses "stress/emphasis/topicalization, different usage
> pattern(s) etc.". Being concerned with language for 40 years, how could I
> possibly not know it?
>
> However, I am unable to imagine a scenario where "John loves Mary" could
> be the same "Mary loves John", with any possible
> stress/emphasis/topicalization, different usage pattern(s) etc. for either
> of them and their combinations. It may be that I am missing something here.
>
>
>> 3. Btw, your usage of the term "word" can be replaced by other
>> alternative formulations, e.g. "term",
>>
>
> Let us terminate this terminological tussle about the term 'term', that is
> to say, the term 'word'.
>
> I have already more than once agreed that the term 'word' is ill-defined
> and that I have even written about it. In this case, you are indulging in
> what can be called shadow boxing.
>
>
>> 4. Re phonetics and phonology: I was not referring to the relevance of
>> phonetic/phonological knowledge per se, that a practitioner in the space of
>> "language and computing" would "need" in order to be competent. But that,
>> as well as a comprehensive knowledge of general language theories and a
>> broad background in p-languages and their (social/usage) contexts, belongs
>> in the toolkit of a good linguist (as in, a good language scientist). To
>> me, progressing to finer granularities is just refining our assumptions,
>> our model.
>>
>
> I mostly agree. Only mostly, since the statement above a somewhat vague
> programmatic statement. If the details were there, I could agree to
> specific things.
>
>
>> But to those who may not have had phonetics/phonology, they may be more
>> likely to think that they "need" "words" and hence my findings might be
>> either a paradigm shift or the end of the world.
>>
>
> So, you are still equating "having had phonetics/phonology", which can be
> translated as having formally attended and passed courses and exams in
> phonetics/phonology. I can't imagine how could you possibly talk about
> de-pedatization if you subscribe to this -- in my opinion -- somewhat
> ridiculous way of thinking. Pardon me for using strong words, but what you
> say is on the borderline of being offensive, if not actually offensive. And
> it is extremely silly and childish, coming from such a well-read person.
>
>
>> 5. Re triple quotes: """
>> I copied and pasted your reply that didn't seem to have been sent to the
>> list and put it in triple quotes, as a reference (for others).
>>
>
> OK.
>
>
>> 6. Re "Do you have any idea how much hundreds of millions of Indians
>> suffer simply from being forced to use English?": in what ways are they
>> "forced"?
>>
>
> I can't even begin to attempt to describe in innumerable ways people are
> forced to use English. There is tons of literature about that, but a lot of
> it may be non-European languages. For example, it is there in Hindi. The
> book that I always wanted to write, but for various reasons couldn't, at
> least so far, was partly about that.
>
> Just to mention a few examples. The medium of instruction in India,
> particularly for higher education, and exclusively for technical and
> scientific education, is in English. Every day hundreds of millions of
> people suffer due to that. The result is that a lot of people grow up with
> complexes and stunted intellect, as they couldn't understand what the
> teacher is saying, what is written in the books, and so on. When they come
> to college, a majority of people have problems writing one decent page of
> content in either their own language(s) or English.
>
> The legal system, particularly at higher levels, works in English. As a
> result, the overwhelming majority of people have no idea what is going on.
> They have to rely on others completely, some of whom themselves may not be
> very fluent in English.
>
> All the lucrative jobs require not only knowledge of English, but spoken
> fluency in English. Not only that, your accent while speaking English puts
> you in a particular caste, so to speak. As a result, an incompetent and
> badly educated person who speaks fluent English can get through life much
> more easily than a competent well-educated person with a 'bad' English
> accent.
>
> There is little incentive to write (and read) in Indian languages, and
> therefore it is very difficult to write and publish literary or academic or
> even other kinds of books in Indian languages.
>
> And so on and on and on.
>
> The challenge is that it is very difficult to solve this problem, since
> there are many major languages in India, and so speakers of one language
> will not accept 'imposition' of another Indian language, or even the
> requirement to learn another Indian language. As a result, just as the
> British ruled by Divide-and-Conquer, so English rules in this time tested
> way.
>
> And to top it all, if you want to be associated with the global
> economics/culture/world-at-large, you again need English.
>
>
>> But I can understand that.
>>
>
> I don't think you do at all, based on your comments.
>
>
>> The more important thing is to also understand that no one has to
>> discriminate based on language(s),
>>
>
> Sure! Who can disagree with that except a language chauvinist?
>
>
>> no one has to adopt a purist attitude when it comes to
>> using/understanding of language by others.
>>
>
> Perfectly true. Did I even hint at that to the least degree? You are again
> shadow boxing.
>
>
>> People can always have various language/linguistic habits, no one has to
>> use "one language only".
>>
>
> Again, did I even hint at that in any possible way?
>
>
>> The point is not to use language as a weapon.
>>
>
> Ditto as above.
>
>
>> [These are things I think you know, but many on this list may not.]
>>
>
> I sure do. I have been thinking and researching about these matters for
> the last 40 years, almost obsessively, from all possible points of view. My
> position on this issue has changed a great deal over the years. But even
> when I started, I was not a purist in any sense of the word. I will always
> be against forcing people to do things they don't want to do.
>
>
>> Re "Do you know that there are and have been schools in the world,
>> including India, where students are punished if they are caught speaking in
>> their mother tongue (or first, "native" language)": is this still happening
>> in India? I've only had similar experiences in my "foreign language"
>> lessons and in real life (using a variety/style y when/where y can be
>> "frowned upon") --- though not "punished", just looked upon
>> with 🙄 or 👀 in ways condescending.
>>
>>
> The last time I checked, it was happening. As of this moment, I don't know
> for sure. But, as pointed out above, innumerable people do suffer in
> innumerable ways due to the supremacy of English in India. I may have some
> suggestions, but I don't really know the solution to this issue, as it is
> complicated by so many factors. I will never ever support forcing people to
> use one or the other language.
>
> Why do you make assumptions as you comment on anything and hurl
> semi-insults? You don't really know me. I didn't assume anything about you.
>
> For example, if I have understood correctly, you simply wanted to say that
> one should pay attention to stress/emphasis/topicalization (I will add,
> from my side, prosody and intonation) when considering the meanings of the
> sentences "John loves Mary" and "Mary loves John" (note that there is no
> question mark here at the end). And you went about it by first saying "let
> me guess" and then making some silly statements about my competence and
> expertise about language(s). You could have simply mentioned the importance
> of stress/emphasis/topicalization in the beginning. I can't imagine any
> reason why you have to make such assumptions.
>
> Well-read and well-educated as you are about language(s), perhaps it is
> possible, even if only remotely, that I could have a thing or two that I
> could tell you about language that you might not perhaps know?
>
>
>> Great weekend!
>>
>>
>
_______________________________________________
Corpora mailing list -- [email protected]
https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
To unsubscribe send an email to [email protected]
