I forgot the main reason for writing the last email. Most importantly, I share your view that orthography is underrepresented in NLP/CL. I once tried to build a computational typology of writing systems. The paper was not published, but I still believe it is something worth doing. Perhaps one day I will complete that work.
Also, I am conscious that, technically, I used the term "category mistake" incorrectly, but I hope I was understood correctly.

On Sat, Aug 5, 2023 at 12:47 AM Hesham Haroon <[email protected]> wrote:

> Hi Ada and Anil,
>
> I'm enjoying reading your discussion. It's been very informative and thought-provoking. Thanks for sharing your insights!
>
> Best,
> Hesham
>
> On Fri, Aug 4, 2023, 8:51 PM Anil Singh via Corpora <[email protected]> wrote:
>
>> I have been enjoying the discussion. I hope it will continue. I have learnt some new things. I was also confused about the tensor thing, although not in the same way.
>>
>> I hope I am not one of the scare-quoted NLP practitioners, because that's exactly what I like to call myself. I certainly don't think I am qualified to work on language just because I can speak one.
>>
>> I am currently reading your thesis and trying to digest it.
>>
>> I also glanced through the syllabus you are preparing. I share your interest in text encodings, among other things. I can't resist talking about text encodings, whether I am teaching NLP or Computer Programming, because I know first-hand the problems in doing NLP for low-resource languages which are related to text encodings.
>>
>> If you can actually teach that syllabus, I envy you, as I am unable to get people interested in the very basics of language/linguistics.
>>
>> About the importance of granularities, I had, in my (very badly written) PhD thesis, explicitly talked about NLP problem formulation in terms of granularities. In my second research paper, I had used byte n-grams for language identification. I use byte n-grams whenever I can. Actually, I used them for language-encoding pair identification, as there are so many non-standard 'encodings' which were used and perhaps are still used for South Asian languages. My very first -- unsuccessful or you may say unfinished -- attempt at doing some kind of NLP, even before knowing that a field called NLP or CL existed, was on building an encoding converter that would work for all 'encodings' used for Indian languages. I too wish there was a good comprehensive history of text encodings, including non-standard ad-hoc encodings.
>>
>> I also share your interest in word-level language identification. In 2007 I had published one of the earliest papers on what I called language identification in a multilingual document, where I had tried word-level language identification, and what is now called language identification for code-switched data.
>>
>> About gender, I had actually made a kind of category assumption. I didn't pay attention to the name, which you share with no less than Ada Byron.
>>
>> We have to be tolerant of what you call bad research for various unavoidable reasons. Research is not what it used to be. At least that's my opinion. Still, in some ways it is better, perhaps like in the case of gender representation.
>>
>> About grammar, I have come to think of it as a kind of language model for describing some linguistic phenomenon. I once received a review in which the reviewer mentioned some grammatical mistakes and wrote that you don't have to just see how the sentence/phrase sounds, you have to explicitly check the grammar according to the rules. Thank you very much, but I learnt English without paying any explicit attention to grammar. I am pretty sure I didn't learn much from explicit teaching of grammar, whether of English, or of Sanskrit, or of French. That doesn't necessarily mean I don't believe in grammar, but I guess I am moving towards the language games view of language.
>>
>> As to language being magical, well, that depends on what you mean by magical. To me, it seems it is magical in the same sense as life itself is magical. Nothing more, nothing less. Even computer programming I have been known to call magical in a certain sense.
>>
>> I also completely agree that we can only hope that we are communicating as we intended, but we rarely, if ever, actually attain that goal.
>>
>> I can't match your background, but I did have -- what can be called -- four rounds of graduate training in different disciplines. I am still trying to learn new things about language. However, I have no experience of field work at all and that I regret, but it is partly because I am not a social creature, or, to be more precise (as if one can be precise with language), I am socially totally incompetent. I wouldn't know how to approach anyone for fieldwork in Linguistics.
>>
>> On Fri, Aug 4, 2023 at 9:03 PM Ada Wan via Corpora <[email protected]> wrote:
>>
>>> @Toms:
>>> for completeness' sake: would you mind please sharing your background?
>>> Thanks.
>>>
>>> On Fri, Aug 4, 2023 at 5:31 PM Ada Wan <[email protected]> wrote:
>>>
>>>> Thanks x2, Ibrtchx.
>>>>
>>>> On Fri, Aug 4, 2023 at 3:30 AM Albretch Mueller <[email protected]> wrote:
>>>>
>>>>> On 8/3/23, Toms Bergmanis <[email protected]> wrote:
>>>>> ...
>>>>>
>>>>> I, for one, have benefited from Ada's, as well as other members' suggestions and comments as I hope they have somehow benefited from mine.
>>>>> lbrtchx
>>>>>
>>>> _______________________________________________
>>> Corpora mailing list -- [email protected]
>>> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
>>> To unsubscribe send an email to [email protected]
>>>
>>
>> --
>> - Anil
>

--
- Anil
