On Fri, Jun 30, 2023, 10:32 AM Emanuel Berg <in...@dataswamp.org> wrote:
> Nicholas Geovanis wrote:
>
> >>>>> If you have python programming skills, you might
> >>>>> consider NLTK
> >>>>
> >>>> Unbelievable if there are no such tools anywhere already,
> >>>> but I don't have one either so maybe there aren't then?
> >>>
> >>> There's a big subject called computational linguistics.
> >>> They have some specialized tools for what they call corpus
> >>> analysis. Because you mentioned statistics you threw
> >>> everyone off :-) And I really like R.
> >>
> >> Okay, so now we are getting somewhere. The technical term
> >> and scientific field of this activity is known as
> >> computational linguistics, and the guys that do that do
> >> corpus analysis. Sweet!
> >
> > Two standard text books are Foundations of Computational
> > Linguistics by R Hausser, and Computational Linguistics: An
> > Introduction by R Grishman.
> >
> > Syntactical analysis of human and artificial (programming)
> > languages is well known. But how do you attach meaning to
> > the symbols? Semantics. How do you identify style and
> > emphasis? These are the kind of starting points for
> > computational linguistics.
>
> Okay, but do we have software in the Debian repositories, or
> anywhere else in the Unix and FOSS world for that matter, so
> we can try it out in practice?
>

Those books teach and discuss some of the software that's used. I
doubt you will find them in Debian's repositories. Of course, you can
do plenty of computational linguistics with Perl or Python, which you
already have.

What is a "regular expression", which is at the heart of Perl and
Python? An expression that describes the strings generated by a
certain type of grammar (a regular grammar, in the formal sense). Perl
and Python are used directly for analyzing text (in any old language).
You are learning basic computational linguistics.

> --
> underground experts united
> https://dataswamp.org/~incal
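
As a rough illustration of the point about analyzing text with the
Python you already have, here is a minimal sketch of two basic corpus
analysis steps: regular-expression tokenization, then word and bigram
counts. It uses only the standard library (re and collections). The
sample text, the token pattern, and the names SAMPLE, WORD_RE,
tokenize, word_frequencies and bigram_frequencies are all made up for
this example; they are assumptions, not taken from the books mentioned
or from any particular tool.

```python
import re
from collections import Counter

# Hypothetical sample text standing in for a real corpus file.
SAMPLE = (
    "The cat sat on the mat. The dog sat on the log. "
    "A cat and a dog: the cat won."
)

# Assumed token pattern: a run of ASCII letters, optionally with
# internal apostrophes (so "don't" stays one token). Real corpora
# usually need something more careful than this.
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def tokenize(text):
    """Lowercase the text and return its word tokens in order."""
    return WORD_RE.findall(text.lower())


def word_frequencies(text):
    """Count how often each word token occurs."""
    return Counter(tokenize(text))


def bigram_frequencies(text):
    """Count adjacent token pairs, ignoring sentence boundaries."""
    tokens = tokenize(text)
    return Counter(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    # Print the three most frequent words with their counts.
    for word, count in word_frequencies(SAMPLE).most_common(3):
        print(f"{count:3d}  {word}")
```

To try it on real text, read a file into a string and pass that to
word_frequencies instead of SAMPLE. Counting bigrams across sentence
boundaries is a simplification; splitting into sentences first would
be the more careful choice.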