[Corpora-List] Re: Any literature about tensors-based corpora NLP research with actual examples (and homework ;-)) you would suggest? ...

Hugh Paterson III via Corpora Wed, 02 Aug 2023 13:40:04 -0700

Dear Ada,

I think I am agreeing with you in terms of finding the right labels for the
scientific units of reference. I have always wondered why computational
linguists have not just simply called these units "strings".


Kind regards,
Hugh

On Wed, Aug 2, 2023 at 11:12 AM Ada Wan via Corpora <[email protected]>
wrote:

> Re RML or any "text technologies" leveraging "grammar" (misnomer or not):
> now is not the right time to be "campy" about (as in, to be
> arguing/protesting for) "grammar", esp. if you do not have a
> background in Linguistics.
> There has been quite some abuse/misconduct with concepts/units/assumptions
> such as "words", "sentences", and "grammar" in the language space (with or
> without computational implementation).
>
> The priority of my communications here is to clarify the part on the
> scientific front, so that anyone who happens to have gotten involved in
> this space can come to more clarity on the status quo,
> esp. given my results. There is a lot that needs to be re-evaluated and
> re-interpreted. Simply stating that something might have been useful in the
> past is not going to be helpful with going forward.
>
> If one is working in technologies with language/text data (e.g., in a
> user-based format/framework, and not working on "grammar" as a
> "linguistic"/philological pursuit), it is recommended that the name(s) of
> such technologies get updated --- if "grammar" [1] does not have to be
> mentioned or be involved, don't.
> [1] or, including but not limited to any of the following: "word",
> "sentence", "linguistic structure(s)", "meaning", "morphology", "syntax",
> "parsing", various terms related to parts of speech (e.g. "nouns",
> "verbs")....
>
> Re "BTW, regarding that "parsing" aspect, what is the term used to
> describe the gradual process of "terminological inception"?":
> conceptualization? Coining of terms?
> According to me, "lexical priming" is different from "terminological
> inception".
>
> Re "How could you clarify intersubjectivity?":
> https://en.wikipedia.org/wiki/Intersubjectivity :)
> Your question is way too broad, or requires an answer that is such, which
> I cannot entertain at the moment.
>
> Thanks for sharing your perspectives. I must admit I have not had time to
> digest all of your points. But this impression recurred in me as I was
> reading them:
> sometimes, I sense that when one claims some concepts are not universal
> (e.g. the ones mentioned in [1] above), others take it to mean that all concepts
> are categorically invalid. That is not what I intended to communicate (with
> all my papers, scientific work, and my comments here). It is an expert
> opinion/finding that I shared, upon some careful evaluation.
>
>
> On Tue, Aug 1, 2023 at 10:26 PM Albretch Mueller via Corpora <
> [email protected]> wrote:
>
>> On 7/31/23, Ada Wan <[email protected]> wrote:
>> > That having been expressed, here are a couple of points re RML that one
>> should pay heed:
>> > i. to what extent and in what context is this a technology relevant?
>>
>>  If you were able to devise an algorithm which, taking as input only NL
>> texts (composed of: a) a start (semantic end); b) a sequence of
>> characters from a relatively large and representative text bank; c) an
>> end (a semantic start)), is able to exhaustively "deduce" the grammar
>> of such texts, and which can be used with any language,
>> you would then:
>>
>>  1) have defined a "space"/"coordinate system" for those texts, to
>> frame (pretty much) all possible "meaningful 'points'"/"phrases" in
>> terms of such grammar, which would also;
>>  2) be a 0-search structure describing the text bank/corpus (every
>> text segment would also become a pointer to every single actualization
>> of that very segment in all texts, no more "n-grams" necessary!),
>> which could;
>>  3) be used with minimal turking/supervision to:
>>  3.1) clean up all automatic translations from YouTube;
>>  3.2) keep multilingual corpora;
>>  3.3) use it for automatic translations (demonstrably, in an almost
>> foolproof, perfect way, since you always have the words/phrases with
>> their context);
>>  3.4) "cosmic/tree reading": instead of reading books/sequences of
>> characters, you would read that text as it relates to all other texts
>> from the same topic;
>>  3.5) parsing: you would keep a corpus of what you know so you won't
>> have to reread about certain topics and aspects you already know
>> (great Lord! how I hate reading a whole book to only find a few, at
>> times marginal, sentences worth reading! or that "youthful" thing of
>> thinking that they just discovered/created an idea because they are
>> just verbalizing it or made a movie about it!) BTW, regarding that
>> "parsing" aspect, what is the term used to describe the gradual
>> process of "terminological inception"? I have heard the term
>> "Adamization", but, even though that word doesn't really rub me the
>> wrong way, I could imagine it is "too sexist" to some people. I
>> wouldn't really care calling it Eveization or "pussyfication" or
>> whatever. I just don't want to use the term that the government uses,
>> "lexical priming", and "terminological inception" sounds too cumbersome
>> as a verb: "terminologically incept"? doesn't sound OK in English;
>>  3.6) of course, an easy application of that contextual parsing would
>> be removing all that js crap and ads before they reach your awareness;
>>  ...
>>  3.n) not last and definitely not least I am thinking hard about how
>> to make sure police and politicians at least have a hard time while
>> using what I have described to "freedom love" people (I know, I know,
>> ... "3.n" doesn't "technically" pertain to quality of implementation
>> issues ..., but I, for one, disagree. Given the "all tangible things"
>> (tm) panopticon in which we are all living these days, each of us in
>> one's own "virtual prison cell", to call it somehow, we should also
>> think about, and be openly honest about, such matters)
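
Point 2) above (every text segment acting as a pointer to each of its occurrences across the corpus) resembles a suffix index. What follows is only a minimal sketch under that assumption, not the algorithm described in the message: the names `build_index` and `occurrences` and the two sample texts are hypothetical, and storing every suffix in full is far too memory-hungry for a corpus of real size.

```python
from bisect import bisect_left


def build_index(texts):
    """Index every suffix of every text, sorted lexicographically."""
    entries = sorted(
        (text[offset:], text_id, offset)
        for text_id, text in enumerate(texts)
        for offset in range(len(text))
    )
    keys = [suffix for suffix, _, _ in entries]
    positions = [(text_id, offset) for _, text_id, offset in entries]
    return keys, positions


def occurrences(index, segment):
    """Return every (text_id, offset) at which `segment` occurs in the corpus."""
    keys, positions = index
    # Suffixes beginning with `segment` are adjacent in sorted order,
    # starting at the first key that is >= segment.
    i = bisect_left(keys, segment)
    hits = []
    while i < len(keys) and keys[i].startswith(segment):
        hits.append(positions[i])
        i += 1
    return sorted(hits)


# Illustrative two-text "corpus" (made up for this sketch).
corpus = ["the rose is a rose", "a rose by any other name"]
index = build_index(corpus)
print(occurrences(index, "rose"))  # [(0, 4), (0, 14), (1, 2)]
print(occurrences(index, "Nile"))  # []
```

A lookup is a binary search followed by a scan over the matches, so no n-gram table of fixed n is needed; a production version would presumably use a compact suffix array or similar structure instead of materialized suffixes.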
>>
>>  I am working right now on such Leibnizian "characteristica
>> universalis" kind of thing. First, I am cleaning approx. 1.2 million texts
>> mostly from archive.org, *.pub and the NYS Regents exams
>> (nysedregents.org + nysl.ptfs.com) which they have, at least
>> partially, translated to more than 10 languages. Is that relevant
>> enough to you? ;-) I am also being quite selfish about it because I
>> have always dreamed of being able to "read"/mind all texts which have
>> ever been written in the same way that teens think they have to have
>> sex with everybody in town to make sense of things.
>>
>> > ii. one can certainly dissect/decompose texts ...
>>
>>  Computing power has become insanely cheap, but it has also enabled
>> too much "cleverhansing" out there. The Delphic phrase: "you can make
>> sense or money" these days translates into some sort of corollary to:
>> "using computers and then thinking about it makes you smart"; but,
>> does it really?
>>
>>  It amazes me how easily you can "dissect"/"decompose texts", talk
>> about "tensors", "vectors", ... (I am not trying to police language
>> usage, it just amazes me); let alone all the insufferable bsing claims
>> by the "Artificial Intelligentsia".
>>
>>  I would go with one character after the other and an open attempt to
>> use the minimal amount of principles to then see what I get. IMO, when
>> you start getting too smart about what you do, of course, you will
>> "see" how smart you are. The poet in me likes Borges' stanzas: "... el
>> nombre es arquetipo de la cosa, en las letras de 'rosa' está la rosa y
>> todo el Nilo en la palabra 'Nilo'" ("its name is a thing's archetype,
>> in the letters of 'rose' is the rose and the whole of the Nile (river)
>> in the word 'Nile'")
>>
>> > II. Re ""magical" in the sense that when we go about our
>> intersubjective business": some intersubjectivity can be further clarified.
>> I don't see much of your examples as being "magical".
>>
>>  I actually do! How could you clarify intersubjectivity? I am trying
>> to do so (somewhat) Mathematically (to the extent you could). Could
>> you share any papers, "prior art" on such matters?
>>
>> > ii. "other people may read, mind, as well ...;": so?
>>
>>  which is a good thing; it is alright, fine and dandy in the hippie way, I
>> meant.
>>
>> > iii. "Alice bought some veggies from Bob, ...)": this I don't
>> understand.
>> > iv. "We see more in money ("words", ...) than just a piece of paper"
>>
>>  iii. and iv. overlap to some extent so I will try to explain them
>> both quickly (which is impossible since you could write philosophies
>> about each line, but here I go). To understand what Marx (may
>> have) meant by „gesellschaftlich notwendige Arbeitszeit” ("socially
>> necessary labour time", wording which has made quite a few go berserk
>> ever since):
>>
>>  https://en.wikipedia.org/wiki/Socially_necessary_labour_time
>>
>>  https://en.wikipedia.org/wiki/Transformation_problem
>>
>>  you have to understand the basic mathematical concepts of:
>>
>>  a) combined rates, and
>>  b) intratextual systems of linear equations
>>
>>  Based on my teaching experience §b is easier to understand. Sorry I
>> couldn't find an "easier" explanation on youtube of that type of SLEs
>> than the one I used with my students preparing for the Regents:
>>
>>  https://ergosumus.files.wordpress.com/2018/10/sle04-en.pdf
>>
>>  the intratextuality of those problems matters to corpora research
>> because different strata of "like terms" ("verbs", "adjectives", ...)
>> are what create grammar. "Crazy me" thinks you could to some extent
>> describe the "likeness of terms" underlying grammar!
>> ~
>>  I also have a guideline about combined rates which I successfully
>> used with my students:
>>
>>  https://ergosumus.files.wordpress.com/2018/06/word_problems12-en00.pdf
>> ~
>>  What the eff do combined rates and SLEs have to do with Marx's
>> transformation problem? ;-)
>>
>>  Well, notice that the -equitable aspect- used to solve combined rates
>> problems is the time (regardless of how much faster or slower one "works"
>> in comparison with others). There is also another type of combined rates
>> problem: you drive to some place with a friend who doesn't care about
>> driving fast, but you need to rest so she drives for a while ... that
>> problem is different from two people meeting at a place, each driving
>> "in their own car" (at their own average speed).
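
The two kinds of combined rates problem distinguished above can be sketched as follows. This is an illustrative sketch only: the function names and all numbers are made up, and constant rates are assumed. When the parties act simultaneously, time is the shared quantity and the rates add; when two drivers take turns in one car, the car is shared and each leg contributes its own speed multiplied by its own time.

```python
from fractions import Fraction


def time_working_together(rates, amount=1):
    """Shared time for parties acting simultaneously at constant rates.

    Time is the common ("equitable") quantity, so the rates simply add:
    (r1 + r2 + ...) * t = amount.
    """
    return Fraction(amount) / sum(Fraction(r) for r in rates)


def distance_taking_turns(legs):
    """Distance covered in one shared car when drivers alternate.

    `legs` is a list of (speed, time) pairs; the times differ per driver,
    so each leg contributes speed * time.
    """
    return sum(Fraction(speed) * Fraction(time) for speed, time in legs)


# One worker needs 3 hours for a job, another needs 6 hours (rates 1/3 and 1/6).
print(time_working_together([Fraction(1, 3), Fraction(1, 6)]))  # 2

# Two drivers start 300 km apart and drive toward each other at 60 and 90 km/h.
print(time_working_together([60, 90], amount=300))  # 2

# One car, two drivers: 2 hours at 60 km/h, then 1 hour at 90 km/h.
print(distance_taking_turns([(60, 2), (90, 1)]))  # 210
```

The "working together" and "meeting" cases share one formula because both parties spend the same time; the shared-car case does not, which is why it needs its own treatment.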
>>
>>  Serge Heiden shared a paper about presidential debates which could
>> also be studied Mathematically as a CR kind of problem (even if
>> politicians, as the crowd management clowns they all are, don't have to
>> make sense, anyway), but, as happens with any dialogue, there are
>> parts of the conversation in which both the cars and the time are
>> shared and other parts in which only (or more of) the time is. I don't know
>> of a general Mathematical formulation of CR kinds of problems which
>> could be used for corpora research. On my "to do" list I have writing
>> papers studying Euclid's Elements and Plato's Dialogues in that way.
>>
>>  Karl Marx, as part of his „Wertgesetz der Waren” (reChristened in
>> English as "labor theory of value"), somewhat metaphorically stated
>> that the exchange value of a commodity is a function of "society's
>> labour-time". He also rendered his ideas as equations (in more of a
>> verbally descriptive, metaphorical way), but that phrase, "society's
>> labour-time", was and still is found to range from questionable to
>> unfalsifiably wild. I don't claim to have mind reading powers, but I
>> think that in his letter to his friend Ludwig Kugelmann, thoroughgoing
>> Hegelian that Marx was, he clearly explained what he meant (page: 222 in
>> file, 208 in book):
>>
>>
>> https://archive.org/download/marxengelsselectedcorrespondence/Marx%20%26%20Engels%2C%20Selected%20Correspondence.pdf
>>
>>  Marx To Ludwig Kugelmann In Hanover London, July 11, 1868:
>>  All that palaver about the necessity of proving the concept of value
>> comes from complete ignorance both of the subject dealt with and of
>> scientific method. Every child knows that a nation which ceased to
>> work, I will not say for a year, but even for a few weeks, would
>> perish. Every child knows, too, that the masses of products
>> corresponding to the different needs require different and
>> quantitatively determined masses of the total labour of society. That
>> this necessity of the distribution of social labour in definite
>> proportions cannot possibly be done away with by a particular form of
>> social production but can only change the mode of its appearance, is
>> self-evident. No natural laws can be done away with. What can change
>> in historically different circumstances is only the form in which
>> these laws assert themselves. And the form in which this proportional
>> distribution of labour asserts itself, in a state of society where the
>> interconnection of social labour is manifested in the private exchange
>> of the individual products of labour, is precisely the exchange value
>> of these products.
>> ~
>>  So, as I see it, in a Hegelian way, Marx was seeing the whole of
>> society as a corpus (in which we all live through our own
>> texts/narratives) talking about "socially necessary labour time" in
>> the way that "time" becomes the equitable aspect shared when
>> people/(-society as a whole-) work together as described by combined
>> rates kinds of problems.
>>
>>  When "Alice buys some veggies from Bob, ..." she uses money as the
>> "equitable aspect" to get Bob's veggies (in the Marxian way they are
>> both part of a combined rates problem) and you tell me this is not
>> magical!
>>
>> > v. "some transactional electronic ("air"...) excitations": I don't get
>> this.
>>
>>  you may pay with cash, using coins or bills, or with your debit card,
>> which at the end of the day becomes transactional electronic
>> excitations on some hard drives. When you speak there is more to it
>> than vibrations/fluctuations of air. (I am referring to the medium
>> which Saussurean signifiers use)
>>
>> > vi. "your 'magic' and mine are different we are still able to
>> 'communicate'. How on earth do such things happen?": a disclaimer: I am not
>> using any magic in my attempts to communicate with you here. I try my best
>> to place myself in your shoes to guesstimate the points that you are trying
>> to get across. But many (as you can see above) didn't quite reach me.
>>
>>  "I try my best to place myself in your shoes" ... ;-) Ha, ha, ha!
>> that is just a functional illusion. What do you know about "my shoes"?
>> I work as a gardener (which I love to do) so they are dirty and
>> smelly, ... I also love to eat garlic ... As I see things, standing in
>> "my dirty and smelly shoes and voicing it from my garlicky mouth",
>> being honest and true to matters is good enough.
>>
>>  lbrtchx
>> _______________________________________________
>> Corpora mailing list -- [email protected]
>> https://list.elra.info/mailman3/postorius/lists/corpora.list.elra.info/
>> To unsubscribe send an email to [email protected]
>>
