I am sorry, I forgot to mention that the strings in my example are Turkish
words, and the lemma AĞAÇ- corresponds to the lemma TREE- in English, which is
a label that allows me to talk about the only meaningful thing that is common
to the set {tree, trees}.
Best,
Orhan
On 17 Oct 2023 20:44, "Bilgin, Orhan (Postgraduate Researcher)"
<[email protected]> wrote:
Dear Ada,
I agree that lemmatisation is a construct and is not a universal method for
linguistic analyses, but I don't understand why it is imperative that I wean
myself from using lemmas.
What is it that restricts my freedom to invent the lemma (a non-universal
construct) AĞAÇ-, for example, to refer to the one and only "meaningful thing"
that is common to the very many (theoretically infinite, practically probably
around 10,000) strings including ağaç, ağacı, ağaca, ağaçlar, ağacımızdaki,
ağaçlandırılabilmesinden, ağaçsızlaşmasını, etc. etc.? How (and why) am I
supposed to talk about that very large set without using a label for it?
Best,
Orhan Bilgin
On 17 Oct 2023 18:36, Ada Wan via Corpora <[email protected]> wrote:
Dear Christian
Re your PS:
one doesn't need to debate the use/future of lemmatization, though I'd welcome
such as part of scholarship. For those experienced in matters in/of
Linguistics, it should be clear that lemmatization was simply a construct, an
entry-level philological exercise (esp. for those from Computer Science with
less of a background in Linguistics and language(s)). It has been sad that some
have picked up the habit of using lemmatization as a heuristic (though for
what, specifically?) and might have become, apparently, too addicted to it to
let it go. It is imperative that one wean oneself from such a habit.
Methods for linguistic morphology, e.g. (morphological) parsing or stemming,
are not a universal decomposition scheme, nor a universal method for
language/linguistic analyses. It is also important to bear in mind that neither
linguistic morphology nor lemmas/lemmata have that long a
history.
Thanks for being open-minded enough to read this far.
Best
Ada