[Corpora-List] Programming for Corpus Linguistics with Python and Dataframes

Susan Hunston via Corpora Tue, 28 May 2024 04:41:22 -0700

Dear list members,
I'm delighted to announce a new publication in the Cambridge Elements in Corpus 
Linguistics series. This publication is FREE to download until 10 June 2024 
(see the link at the bottom of this email).
Title: Programming for Corpus Linguistics with Python and Dataframes
Author: Daniel Keller, Western Kentucky University
Summary: This Element offers intermediate or experienced programmers algorithms
for Corpus Linguistic (CL) programming in the Python language using dataframes,
which provide a fast, efficient, and intuitive set of methods for working with
large, complex datasets such as corpora. This Element demonstrates principles
of dataframe programming applied to CL analyses, as well as complete algorithms 
for creating concordances; producing lists of collocates, keywords, and lexical 
bundles; and performing key feature analysis. An additional algorithm for
creating dataframe corpora is also presented, including methods for tokenizing,
part-of-speech tagging, and lemmatizing using spaCy. This Element provides a
set of core skills that can be applied to a range of CL research questions, as 
well as to original analyses not possible with existing corpus software.


The Element can be accessed using this link:
https://doi.org/10.1017/9781108904094

Susan Hunston   (she/her)
Professor of English Language
+44 121 414 5675
University of Birmingham
Department of English Language and Linguistics
www.birmingham.ac.uk

