On Fri, Jun 30, 2023, 10:32 AM Emanuel Berg <in...@dataswamp.org> wrote:

> Nicholas Geovanis wrote:
>
> >>>>> If you have python programming skills, you might
> >>>>> consider NLTK
> >>>>
> >>>> Unbelievable if there are no such tools anywhere already,
> >>>> but I don't have one either so maybe there aren't then?
> >>>
> >>> There's a big subject called computational linguistics.
> >>> They have some specialized tools for what they call corpus
> >>> analysis. Because you mentioned statistics you threw
> >>> everyone off :-) And I really like R.
> >>
> >> Okay, so now we are getting somewhere. The technical term
> >> and scientific field of this activity is known as
> >> computational linguistics, and the guys that do that do
> >> corpus analysis. Sweet!
> >
> > Two standard textbooks are Foundations of Computational
> > Linguistics by R. Hausser and Computational Linguistics: An
> > Introduction by R. Grishman.
> >
> > Syntactical analysis of human and artificial (programming)
> > languages is well known. But how do you attach meaning to
> > the symbols? Semantics. How do you identify style and
> > emphasis? These are the kind of starting points for
> > computational linguistics.
>
> Okay, but do we have software in the Debian repositories, or
> anywhere else in the Unix and FOSS world for that matter, so
> we can try it out in practice?
>

Those books teach and discuss some of the software that's used. I doubt you
will find that software in Debian's repositories. Of course, you can do plenty
of computational linguistics with Perl or Python, which you already have.
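
As a rough sketch of what I mean, here is a word-frequency count, about the
simplest corpus analysis there is, using only the Python standard library. The
sample text and the function name word_frequencies are made up for
illustration.

```python
from collections import Counter
import string


def word_frequencies(text):
    """Count how often each word occurs, ignoring case and edge punctuation."""
    counts = Counter()
    for raw in text.split():
        # Strip punctuation from both ends and fold case, so "Mat." == "mat".
        word = raw.strip(string.punctuation).lower()
        if word:
            counts[word] += 1
    return counts


if __name__ == "__main__":
    sample = "The cat sat on the mat. The mat was flat."
    # Print the three most frequent words with their counts.
    for word, n in word_frequencies(sample).most_common(3):
        print(word, n)
```

Splitting on whitespace is crude (it knows nothing about hyphenation or
contractions), but it is enough to start looking at a corpus.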

What is a "regular expression", which is at the heart of Perl and a standard
part of Python? A pattern describing the set of strings that conform to a
certain type of grammar (a regular one). Perl and Python are used directly for
analyzing text in any language. If you use them that way, you are already
learning basic computational linguistics.
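
To make the grammar point concrete, a small sketch: the pattern below describes
a toy language whose strings are one or more repetitions of "ab", and
re.fullmatch tests whether a string belongs to it. The pattern and the helper
name in_language are only illustrative.

```python
import re

# Toy regular language: one or more repetitions of "ab".
AB_PLUS = re.compile(r"(?:ab)+")


def in_language(s):
    """Return True if the whole string s belongs to the language (ab)+."""
    return AB_PLUS.fullmatch(s) is not None


if __name__ == "__main__":
    for s in ["ab", "abab", "aba", ""]:
        print(repr(s), in_language(s))
```

fullmatch is used rather than search because membership in a language is a
statement about the entire string, not about some substring of it.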

-- 
> underground experts united
> https://dataswamp.org/~incal
>
>
