Re: [R] subset English language using textcat package

Robert David Burbidge via R-help Mon, 19 Nov 2018 03:14:00 -0800

Look at the help docs and examples for textcat and sapply:


print(as.character(data$x[sapply(data$x, textcat)=="english"]))

Note, however, that with its defaults textcat classifies "This book is amazing" as Dutch, so you may want to read the help for textcat and change the profile db ("p") or the "method".
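As a rough sketch of what changing "p" could look like: the code below restricts the profile db to English and German, on the assumption that these are the only two languages in the data. The names en_de, texts, lang and english are made up for illustration, and the data frame is rebuilt from the dput output in the question. I have not checked that every sentence is then classified as intended, so inspect lang before relying on the subset.

```r
library(textcat)

# Example data rebuilt from the dput() output in the question (factor column).
data <- data.frame(
  x = factor(c("I love this book", "This book is amazing",
               "several books in proccess", "ich liebe dieses Buch",
               "Dieses Buch ist erstaunlich", "mehrere bücher in prozess"))
)

# textcat() is vectorised over a character vector, so sapply() is not required.
texts <- as.character(data$x)

# Assumption: only English and German occur, so keep just those two profiles
# from the default character-based profile db.
en_de <- TC_char_profiles[names(TC_char_profiles) %in% c("english", "german")]

# One language label per text.
lang <- textcat(texts, p = en_de)

# which() drops any NA results instead of propagating them into the subset.
english <- texts[which(lang == "english")]
print(english)
```

With only two candidate profiles, short sentences can no longer drift to a close neighbour such as Dutch, but a text in any other language would be forced into one of the two labels.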


On 19/11/2018 09:48, Elahe chalabi via R-help wrote:

Hi all,

How can I subset the English texts from a data frame containing German and
English texts using the textcat package?



     > library(textcat)
     > dput(data)
     structure(list(x = structure(c(2L, 6L, 5L, 3L, 1L, 4L),
         .Label = c("Dieses Buch ist erstaunlich", "I love this book",
                    "ich liebe dieses Buch", "mehrere bücher in prozess",
                    "several books in proccess", "This book is amazing"),
         class = "factor")), row.names = c(NA, -6L), class = "data.frame")

I want the output to be like the following:


     "I love this book"  "This book is amazing"  "several books in proccess"


Thanks for any help!
Elahe


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
