Jeff,
Here is the solution:
myTerms <- c("prostatic", "adenocarcinoma", "grade")
## inspect() prints only a sample of up to 10 docs from the DTM
inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
## as.matrix() returns the counts for all docs in the DTM
as.matrix(DocumentTermMatrix(docs, list(dictionary = myTerms)))
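
For anyone without the patient files, here is a minimal self-contained sketch of the same approach. The three toy documents and the object names (texts, dtm, counts) are made up for illustration; only the dictionary terms come from the thread. It assumes the tm package is installed.

```r
library(tm)

# Three toy documents standing in for the 102 patient files (made-up text)
texts <- c(
  "prostatic adenocarcinoma grade 3 adenocarcinoma",
  "benign tissue with no tumor identified",
  "grade 4 grade 5 prostatic"
)
docs <- VCorpus(VectorSource(texts))

# Restrict the document-term matrix to the terms of interest
myTerms <- c("prostatic", "adenocarcinoma", "grade")
dtm <- DocumentTermMatrix(docs, control = list(dictionary = myTerms))

# as.matrix() converts the sparse DTM to an ordinary matrix with one row
# per document, so every document is included, not just a printed sample
counts <- as.matrix(dtm)
counts

# Total occurrences of each term across the whole corpus
colSums(counts)
```

Because counts is a plain matrix, it can be subset, summed, or written out with write.csv() like any other matrix. For a very large corpus the dense matrix may be big, but with only a handful of dictionary terms that should not be a concern.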
Patrick Casimir, PhD
Health Analytics, Data Science, Big Data Expert & Independent Consultant
C: 954.614.1178
________________________________
From: Jeff Newmiller <[email protected]>
Sent: Friday, May 19, 2017 11:04:22 AM
To: [email protected]; Patrick Casimir; [email protected]
Subject: Re: [R] PROBLEM USING DICTIONARY WITH TM PACKAGE
Considering the deafening silence after three repeats, one explanation could be
that you are asking the wrong group of people. It is also possible that your
failure to follow the Posting Guide with regard to using plain text email and a
reproducible example [1][2] means that readers who are not experts do not feel
inclined to follow along with you and help you think of solutions. Keep in mind
that supporting contributed packages like tm is technically not on topic here,
though people often do feel the urge to help solve problems with them anyway.
With regard to asking the wrong group of people, I would suggest asking the
maintainer of the tm package what they recommend. See the help for the
maintainer function or read the CRAN Web page for that package.
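
As a small sketch of that lookup, assuming the tm package is installed locally (the variable name tm_maintainer is just for illustration):

```r
# maintainer() lives in the utils package, which is attached by default.
# It returns the Maintainer field of the installed package's DESCRIPTION,
# or NA if the package is not installed.
tm_maintainer <- maintainer("tm")
tm_maintainer
```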
[1] http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
[2] http://adv-r.had.co.nz/Reproducibility.html
--
Sent from my phone. Please excuse my brevity.
On May 19, 2017 7:12:45 AM PDT, Patrick Casimir <[email protected]> wrote:
>Dear Members & Experts,
>
>
>Since the Dictionary() function is no longer available in the tm
>package, how do I use other functions to do the same as below? I want
>to capture a list of specific terms from a corpus. For example, my
>corpus has 102 files, and I want to see a list with occurrences of
>prostatic, adenocarcinoma, and grade in all 102 files. When I use the
>function Dictionary(), I get the error: Error: could not find function
>"Dictionary"
>
>
>> d <- Dictionary(c("prostatic", "adenocarcinoma", "grade"))
>> inspect(DocumentTermMatrix(docs, list(dictionary = d)))
>
>
>But if I use the code below with inspect(), the output only shows
>the terms for 10 files instead of 102. I need a way to get my
>dictionary to capture and return those terms, or whatever other terms
>I select, for all 102 files. I know I am close, but inspect() is not
>the right function.
>
>
>> myTerms <- c("prostatic", "adenocarcinoma", "grade")
>> inspect(DocumentTermMatrix(docs, list(dictionary = myTerms)))
>
> <<DocumentTermMatrix (documents: 102, terms: 3)>>
> Non-/sparse entries: 292/14
> Sparsity           : 5%
> Maximal term length: 14
> Weighting          : term frequency (tf)
> Sample             :
>                Terms
> Docs            adenocarcinoma grade prostatic
>   Patient14.txt             11     6         3
>   Patient15.txt              7    12         2
>   Patient16.txt             13    16         4
>   Patient19.txt              5    13         2
>   Patient24.txt             11    12         4
>   Patient25.txt              8     9         4
>   Patient41.txt              8    10         4
>   Patient46.txt              8    10         3
>   Patient8.txt               9    12         2
>   Patient9.txt               8    23         2
>
>
>Thanks
>
>
>
>Patrick Casimir, PhD
>Health Analytics, Data Science, Big Data Expert & Independent
>Consultant
>C: 954.614.1178
>
>
>
> [[alternative HTML version deleted]]
>
>______________________________________________
>[email protected] mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.